Feedback requested about breakage: type variable binders in datatypes

The GHC Steering Committee is considering accepting https://github.com/ghc-proposals/ghc-proposals/pull/425, a proposal to add new syntax allowing more explicit datatype and class declarations. For example, instead of writing

class Category cat where ...

we can now write (optionally!)

class Category @k (cat :: k -> k -> Type) where ...

Note the new @k bit. This new feature is fully optional, but it will allow clarity in a number of declarations, and there are a few obscure scenarios where it allows us to make type definitions that were previously impossible to express.
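
For a datatype, the same optional annotation would look something like this (a sketch of my own, assuming the syntax lands roughly as proposed and is enabled by the TypeAbstractions extension; the standalone kind signature is just there for clarity):

    {-# LANGUAGE TypeAbstractions, PolyKinds, StandaloneKindSignatures #-}

    import Data.Kind (Type)

    type Proxy :: forall k. k -> Type
    data Proxy @k (a :: k) = MkProxy   -- the @k binder names the kind variable explicitly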

Along with this primary change, the proposal cleans up a number of related facilities. These cleanups will encourage more readable code and will lead to better error messages. However, there are breaking changes. In light of recent worries about breaking changes, we want to reach out to the broader community seeking feedback before we accept this proposal.

The breaking changes can be seen at ghc-proposals/0000-decl-invis-binders.rst at decl-invis-binders · serokell/ghc-proposals · GitHub, which can be read without reading the rest of the proposal. There are 4 separate breakages, all of which affect only code that uses relatively fancy type-level features.

  • One will be delayed by three releases, as its fix is not backward compatible. Code that works today will continue to work for three more releases and will then stop working. The fix will be available from the first release that implements the proposal, so an author can apply it at any time during the next three release cycles.
  • One breakage is very obscure but has no migration path. Anyone who has written code that triggers it would have to use either CPP or rather creative type shenanigans to keep their code compiling both before and after this change. Remember that the code is very obscure!
  • Two breakages have backward-compatible mitigations (that also improve readability). Code that works today would stop working in the release that implements the proposal, but the fix can be applied now, and code with the fix compiles on several GHC versions (reaching back into the middle of the GHC 8 series).

All of the details, including examples, are linked above. Note that each breakage is optional. We don’t have to do this, but we think the language will be better afterwards than before.

What does the community think here? The committee is broadly in favor. But would the breakage be too disruptive? If you think this would be problematic, it would be very helpful to us if you could point to a specific package (or better, module & line number!) that would be affected.

Even if we go forward at this point, once the change is applied, we will test against head.hackage and report back if that experience suggests we re-examine.

13 Likes

Can I first of all say I very much appreciate the call for feedback. :heart: I would probably not have noticed this was going on otherwise.

I personally wouldn’t be against the proposal, in principle at least. Being explicit is a good thing, I feel, and in some situations it also makes type shenanigans feel less like “type magic”. I’d like to know what the impact on the current ecosystem would be first, though, and whether this would degrade the type-level programming experience in general (will errors get worse? will it become more difficult? etc.)


This also sounds like a prime candidate for a custom program that can scan source code to find spots that would (potentially) be impacted. And maybe even suggest the fix? Simplest would be a parser, but using .hie files might make this even more reliable?

Other thoughts: it might also be a nice feature for such a program to have flags, so you can filter on locations that are 100% certainly impacted versus “probably” impacted, and maybe also a flag that makes the scan SUPER conservative in its reporting, so you can be sure that if this last setting returns NOTHING, then the scanned code DEFINITELY will not be impacted.

For example, in this case, if the code has no type synonyms and no type family declarations ANYWHERE, that code will definitely not be affected. And the 100% certain filter might use .hie files to know for sure? :man_shrugging:
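
Just to make that idea concrete, here is a deliberately dumb sketch of the SUPER conservative mode (plain text matching only, no .hie files; all names are made up, and this is definitely not the real tool):

    module Main where

    import Data.List (isPrefixOf)
    import System.Environment (getArgs)

    -- A line is "suspicious" if it starts a type synonym, type family, or
    -- associated type declaration, matching the conservative criterion above:
    -- no such declarations anywhere means definitely unaffected.
    suspicious :: String -> Bool
    suspicious l = "type " `isPrefixOf` dropWhile (== ' ') l

    main :: IO ()
    main = getArgs >>= mapM_ report
      where
        report f = do
          hits <- filter (suspicious . snd) . zip [1 :: Int ..] . lines <$> readFile f
          if null hits
            then putStrLn (f ++ ": no type-level declarations found, definitely unaffected")
            else mapM_ (\(n, l) -> putStrLn (f ++ ":" ++ show n ++ ": " ++ l)) hits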


I have a question about this code snippet in the proposal, though.

type family F a :: k
type instance F @(j -> j) Int = Any :: j -> j

Seeing the “type definition” (type family F a :: k) would make me think the @(j -> j) sets the a type variable, so this one confuses me a bit. Wouldn’t the “type definition” then also have to be declared as type family F @k a :: k?

Because the example in Breakage 2 shows

type F :: Type -> forall k. Maybe k
type family F x @k where ...

where the order is F x @k, and not F @k x.

(Also, wouldn’t the :: j -> j then be unnecessary, because F @(j -> j) Int = Any already has all the information?)

EDIT: I think I figured out why the @k is the first argument to F. It’s because there’s an implicit forall k. quantified up front in the type family F a :: k definition, right? (As if its kind were F :: forall k. a -> k.)
Apologies for the somewhat off-topic section, but since this might provide some sort of insight to someone, I’ll leave it in.
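
(For intuition, the term-level analogue of that ordering, with plain TypeApplications; my own toy snippet:)

    {-# LANGUAGE TypeApplications #-}

    -- @-applications target the implicitly quantified variables in the order
    -- they are quantified: const :: forall a b. a -> b -> a.
    one :: Int
    one = const @Int @Bool 1 True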


Another problem I can definitely see happening is that error messages are going to degrade to a heaping pile of bad, and consequently make the type level programming experience worse. Could anyone provide good arguments that this will not happen?


So far my feedback. Hope it helps :slight_smile:

3 Likes

Thanks for the responses!

@Vlix

A migration/detection tool would indeed be fantastic, but we are not proposing writing one right now. That’s a significant undertaking. Maybe someday – maybe you want to write one? – but as of now, the proposal is not conditioned on this work.

You figured out the strange snippet. The problem is that, today, we can’t even write the type family F @k a :: k syntax that would make this explicit.

Another problem I can definitely see happening is that error messages are going to degrade to a heaping pile of bad

Why? For all four examples of breakage, we’re confident that we can detect the situation and present helpful error messages. I can’t make blanket guarantees, but I see no reason to believe that messages will get worse. Generally speaking, adding the ability for something to be more explicit helps error messages, because it increases the bandwidth from user to GHC. Making GHC be cleverer (by allowing more to be left implicit) generally makes messages worse. This proposal is solidly about explicitness, so I expect error messages to become better, not worse.

@atravers

(Response to now-deleted post constructively suggesting different syntax for the proposal. I had said this thread was more intended to be about breakage.)

3 Likes

I think the plan is good, and the best thing to do is wait and see with the head.hackage results. It is very hard to discuss these things one way or the other without good evidence.

This was my mistake:
Why use @? I find it quite confusing with type application - you need to see which side of the = it is on to get the meaning. Perhaps I might not be the intended audience…

1 Like

@Abab9579: as @rae wrote, this is about migration plans, not the proposal itself. If you have concerns about the proposal, they should go in the ghc-proposals thread on GitHub instead: Invisible binders in type declarations (under review) by int-index · Pull Request #425 · ghc-proposals/ghc-proposals · GitHub

Oh, okay. Sorry for my mistake.

While I do want to keep this thread about breakage – and not about the main proposal payload – I also will answer these other questions here. Not everyone feels comfortable e.g. posting in the ghc-proposals repo, and so this might be the only place these questions are raised. I would love for more to feel comfortable in the ghc-proposals repo, but my desire for this does not make it the case.

@Abab9579 I confess not to be sure exactly why @ was chosen. This was the syntax proposed in our paper describing visible type application, but that’s not an answer, because it doesn’t say why that paper proposed that symbol. I believe GHC used @ for some time to denote type applications when pretty-printing Core, but I still don’t know why. @sweirich might know more history here.

To me @ reads very naturally: id @T: the identity function at type T. Or have I missed the point?
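
For instance (toy examples of my own, just ordinary TypeApplications):

    {-# LANGUAGE TypeApplications #-}

    answer :: Integer
    answer = id @Integer 42        -- "the identity function at type Integer"

    parsed :: Double
    parsed = read @Double "1.5"    -- pick the Read instance explicitly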

I’m not affected by any of these changes (and they look very reasonable), so I just want to say that I appreciate all the work that goes into making Haskell even better. :blue_heart:

2 Likes

Something I find concerning here is that it raises the amount of things that must be learned. Specifically, one must now know about type applications in multiple cases.

When trying to onboard someone to the language and an existing codebase, this will be more to cover. While the code that I typically work on and bring people into the language with uses type families sparingly, this seems easy for someone to trip over; I’m thinking in particular of the last listed change being easy to run into.

Another piece for beginners is if this allows the use of @ syntax without an extension, I can imagine that being confusing with the TypeApplications extension.

None of this is to say I am strictly against this change. But I do think keeping in mind the amount of syntax for newcomers and their overall experience is worthwhile.

1 Like

The proposal says the new syntax is guarded behind the TypeAbstractions extension.

1 Like

I missed that part somehow, thank you for pointing it out!

To make sure I understand correctly: only implicit kind matching on the argument side remains legal? By definition that is ‘fully determined by the left hand side’, but from the example section it seems like the kind argument would remain unconstrained and this wouldn’t typecheck:

    type H :: forall x. forall (a :: x) -> Type
    type family H a where
       H Int = Int
       H (->) = (Int, Int)

I think type families where the result kind depends on some argument kind are reasonably common in my own code, but I usually use type families to torture the type system, not to be productive. I use generic-lens quite often and it seems to have a couple of occurrences, though. Would this code remain legal? https://hackage.haskell.org/package/generic-lens-core-2.2.1.0/docs/src/Data.Generics.Internal.GenericN.html#Indexed

type family Indexed (t :: k) (i :: Nat) :: k where
  Indexed (t a) i = Indexed t (i + 1) (Param i a)
  Indexed t _     = t

In my onboarding experience, I found this topic (to do X you first need to learn Y and Z) to be quite a challenge. At times it was exhausting. I also find that experienced Haskellers don’t seem to remember what this was like, or understand what it’s like for new Haskellers onboarding. From what I’ve seen, academic-minded people taking Haskell on experience these challenges as FUN, like playing a game. That’s great, and I applaud you. However, for those of us who are both less experienced and production-minded, working to “get shit done” or “bring Haskell to bear” on our problems, they are obstacles that prevent us from getting where we’re trying to go, and adding more to that pile is disheartening.

I can’t really speak to this breakage topic directly, but I would like to advocate for the leaders of the community to better understand what it’s like for those of us who experience these “learn Y and Z to do X” (where X is something like “interact with JSON”) as obstacles.

@ketzacoatl I would make the opposite point. Without this change, I find it really hard to understand type families and data families to this day.

Simple cases are fine, since they work with a simpler mental model, and that will remain the case.
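
(A made-up example of the kind of simple case I mean, where everything is visible and of kind Type, so there is nothing for an invisible binder to even name:)

    {-# LANGUAGE TypeFamilies #-}

    type family Elem c
    type instance Elem [a]       = a
    type instance Elem (Maybe a) = a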

Complicated cases, however, are totally crazy today, and require all this imagination to fill in the @s to really be explicit at all. I don’t think that is good for new users in the slightest.

The new rules might require writing more in a few cases, and in those cases, yes, there is more code, and that could scare readers of the code at first. But once they get over that hurdle (and again, hopefully new users don’t need to deal with these complex cases in the first place!) it’s just a matter of understanding the code that exists, not the more complicated exercise of imputing all the missing stuff that must exist in the compiler’s internals for things to make sense.

As far as education and onboarding is concerned, I am always more pro-explicitness. By the time users reach type families at all, they should be well enough versed in the far more important basics not to immediately swoon at any @. Then, they should very carefully work through all the explicit code, making sure to understand each bit, but with the confidence that there is nothing more to understand than what’s there.

Conversely, optimizing for “less scary code” might allow some folks to fake it without understanding what’s going on, but who I really worry about are those who don’t even realize they are getting by with a false, simplified mental model. They are being lulled into a false sense of security and will be rudely startled when the complexity finally bursts forth later.

2 Likes

The bottom line is that Invisible binders in type declarations (under review) by int-index · Pull Request #425 · ghc-proposals/ghc-proposals · GitHub and Modern Scoped Type Variables (under review) by goldfirere · Pull Request #448 · ghc-proposals/ghc-proposals · GitHub are taking what is a complete mess of ad-hoc, incrementally evolved design, one that takes perhaps a thorough study of Haskell’s history to fully understand, and replacing it with clear, consistent rules and a clear notion of what the “fully explicit” version of the more abbreviated shorthands looks like.

Yes, this stuff is also needed for Dependent Haskell, but to me that is really not the point — cleaning up the language is far more important than making it more expressive.

I expect the result of all this to in fact be a huge benefit to Haskell pedagogy, and I look forward to seeing it happen.

2 Likes

Thanks for the explanations @Ericson2314.

I also read @telser’s comment as saying that “type applications” would be applicable / required understanding in new or additional places where it is not today. Are you saying that is not the case, or not applicable b/c it isn’t required understanding for “the simple case”?

In a select few cases @ is required, e.g. those involving polymorphic “return” types. The proposal has the details. It still requires -XTypeAbstractions, so without that one loses expressive power rather than having weird implicit stuff.

Those cases without the @ look very weird, e.g. as if we are mapping one input to multiple outputs! I wouldn’t want new users to think we violated the most basic property of functions!
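
A made-up sketch of what I mean (the family G is hypothetical, but the pattern is the same as the F example earlier in the thread):

    {-# LANGUAGE TypeFamilies, PolyKinds #-}

    -- A family whose result kind is polymorphic.
    type family G a :: k

    -- Both equations have the same visible argument; they are distinguished
    -- only by the invisible kind argument, which today stays entirely implicit.
    type instance G Int = Char     -- in full:  G @Type             Int = Char
    type instance G Int = Maybe    -- in full:  G @(Type -> Type)   Int = Maybe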