Monad of No Return: Next Steps

The “Monad of No Return” proposal has been in the process of being implemented for a while. I’m collecting community feedback on the proposal (and specifically phase 3) in the hope that, as a community, we can move forward in making Haskell a better language. I encourage people to read the proposal above, but I’ll summarise and quote below where relevant.

What is this proposal?

The proposal observes that, now that Applicative is a superclass of Monad (which happened with the Functor-Applicative-Monad proposal, implemented in GHC 7.10), the definitions of return and (>>) in the Monad typeclass are redundant, as they can always be lawfully expressed in terms of pure and (*>) respectively. As such, they can and (by this proposal) should be removed from the typeclass and defined in terms of their Applicative equivalents.
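To make the redundancy concrete, here is a minimal sketch of an instance that defines neither return nor (>>). The Logged type and the say helper are hypothetical names invented for this illustration; they are not part of the proposal.

```haskell
-- Hypothetical logging type, used only for illustration.
data Logged a = Logged [String] a
  deriving (Show, Eq)

instance Functor Logged where
  fmap f (Logged w a) = Logged w (f a)

instance Applicative Logged where
  pure = Logged []
  Logged w f <*> Logged w' a = Logged (w ++ w') (f a)

-- Neither return nor (>>) is defined here. In today's base, return
-- defaults to pure, and (>>) defaults to \m k -> m >>= \_ -> k, which
-- agrees with (*>) for a lawful instance.
instance Monad Logged where
  Logged w a >>= f =
    let Logged w' b = f a
    in Logged (w ++ w') b

-- Hypothetical helper: record one message.
say :: String -> Logged ()
say msg = Logged [msg] ()

main :: IO ()
main = do
  print (return 'x' == (pure 'x' :: Logged Char))
  print ((say "a" >> say "b") == (say "a" *> say "b"))
```

Both comparisons should hold for any lawful instance, which is why the two methods carry no extra information.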

Why should we do this?

Having return and (>>) introduces redundancy, redundancy that

violates the “making illegal states unrepresentable” idiom: Due to the default implementation of return this redundancy leads to error-prone situations which aren’t caught by the compiler; for instance, when return is removed while the Applicative instance is left with a pure = return definition, this leads to a cyclic definition which can be quite tedious to debug as it only manifests at runtime by a hanging process. [1]
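The pitfall in the quote can be sketched as follows. Box is a hypothetical type made up for this example, and the pragma is only there to silence the existing noncanonical-instance diagnostic so that the old-style definition compiles.

```haskell
{-# OPTIONS_GHC -Wno-noncanonical-monad-instances #-}

-- Hypothetical type, used only for illustration.
newtype Box a = Box a
  deriving (Show, Eq)

instance Functor Box where
  fmap f (Box a) = Box (f a)

instance Applicative Box where
  pure = return            -- old style: pure is defined via return
  Box f <*> Box a = Box (f a)

instance Monad Box where
  -- This works only because return is defined explicitly. If this line
  -- were deleted, return would fall back to its default (return = pure),
  -- and together with pure = return above the two would call each other
  -- forever. The compiler accepts that; the failure shows up at runtime.
  return = Box
  Box a >>= f = f a

main :: IO ()
main = print (pure 1 :: Box Int)
```

Writing the real definition in pure and leaving return undefined removes the possibility of this cycle.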

Additionally, keeping return and (>>) as Monad methods means that in cases where a weaker Applicative constraint would suffice we instead get a Monad constraint.
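As a small illustration of the constraint issue, the two functions below do the same thing, but only the second can be used with a type that is Applicative without being a Monad. The function names are hypothetical; ZipList (from Control.Applicative) is a standard example of such a type.

```haskell
import Control.Applicative (ZipList (..))

-- Written with (>>) and return, so the constraint has to be Monad.
thenReturnM :: Monad m => m a -> b -> m b
thenReturnM act x = act >> return x

-- Same behaviour written with (*>) and pure: Applicative is enough.
thenReturnA :: Applicative f => f a -> b -> f b
thenReturnA act x = act *> pure x

main :: IO ()
main = do
  print (thenReturnM (Just (1 :: Int)) 'x')
  -- ZipList has no Monad instance, so only the Applicative version
  -- type-checks here.
  print (getZipList (thenReturnA (ZipList [1, 2, 3 :: Int]) 'x'))
```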

Currently the status quo is to optimise (>>) over (*>), which means that there are unexpected performance regressions[1] when generalising code; this is particularly noteworthy in the case of Foldable/Traversable methods, and additionally blocks the collation of their methods (for example mapM vs traverse) too.

A minor benefit would be to reduce the sizes of the class dictionaries, as well as avoid additional indirection through Monad when Applicative will suffice.

What’s been done so far?

So far we’ve completed phase 1 of the migration strategy. Phase 1 introduced a warning when non-lawful instances of return and (>>) are defined. This was implemented in GHC 8.0, and added to the default warning set in GHC 8.10.

What’s next?

Phase 2 moves return and (>>) to be top level definitions, and lets GHC ignore lawful definitions of return and (>>) in Monad. As part of this, non-lawful definitions would result in a compile time error, while lawful definitions are left alone.

Phase 3 would then start warning about lawful implementations, similar to phase 1’s warning on unlawful implementations.

Phase 4 would finish off the proposal by removing GHC support for return and (>>) overrides, turning any such override into a compile time error.

So why are you posting about this?

It’s been a while since there’s been much update on this front, which is a shame for a proposal that would make Haskell a cleaner and more coherent language to program in. For many, the move to phase 2 won’t impact them, as unlawful definitions have been warned against for a time. Phase 3 has the potential to impact more people, so that is the primary focus of this post.

Why is phase 3 a sticking point?

It isn’t! What is currently lacking is community input on how to make the changes that phase 3, and this proposal, represent digestible and easily understood by the majority of Haskell users. We’re in no rush to break setups or have confused beginners ask why their tutorials are failing them.

What are we meant to do about this?

Getting feedback from the community is the primary purpose of this post, but I’m also taking the opportunity to ask for volunteers to split the work and make progress on this issue. If you want to take part, please let people know! GHC is a community project after all.


I personally prefer return over pure and hope it will stay somehow.
Having said that, I’ve been thinking of something for a while and this seems the perfect opportunity to talk about it.

What about making return a keyword that starts a new block (with or without the BlockArguments extension), so that one can write return x+y instead of return (x+y) or return $ x + y?

return will likely be staying forever; at the end of this proposal, it will probably be a top level definition of return = pure.

Adding more keywords, or making return a keyword is drastically outside of the scope of this proposal. If you want to make that happen, feel free to make your own proposal to GHC.


Why not skip phase 2 (i.e., do phases 2 and 3 together)?

Once you’re using a GHC that is far enough along this sequence that the in-Monad definitions are optional, there’s no reason to include them other than maintaining non-CPP compatibility with older GHCs. The only difference between phases 2 and 3 is suppressing a warning about this. But what good does that do? Suppressing the warning only delays the point at which the ecosystem transitions away from return-ful Monad. It doesn’t make the breaking change between phases 3 and 4 any less breaking. We have to eat that cost eventually, if we want this plan to succeed at all. Why not start nudging people toward the transition as soon as the alternative is made available?


I think I agree personally, but this revised proposal migration scheme was used to mete out warnings and breakage over a longer period of time. If more people speak up here (or like your comment) it would be worth combining phases 2 and 3 (pending CLC approval as well of course).


I’d be happy to give a hand on some tickets :slight_smile:


I wonder if we could make “Phase 4” be a language extension that gets added to a future GHCXXX set. I’m not sure if this is feasible, but it would be very nice in terms of a smooth transition, since it would allow old lawful code to compile indefinitely. Ofc it also comes with a maintenance cost for GHC devs. I guess it depends on how much code uses this stuff and how much of it is still maintained.


So “no extensions” GHC would allow the overrides, but “default extensions” GHC would remove support for them?
While the idea for backwards compatibility is attractive I dislike the idea of having to support these warnings and overrides into the future; phase 4 as presented would hopefully clean up the remaining gubbins.
Really, we should look into current packages and see how many of them define return or (>>), which would be good to do now or as phase 2/3 implementation begins.


Yeah that’s right. If you specify a recent enough language edition then it would not be allowed.

I just grepped for ^\s*return\s*= on Hackage. I think that will give us a pretty accurate idea of how many packages would need to be changed. That gives me 1006 packages: Packages that contain "return =" · GitHub

Grepping for ^\s*return = pure got me 452: Packages that contain "return = pure" · GitHub

(I used GitHub - nh2/hackage-download: Script to download all of Hackage and rg)


Could you make a list with the following match?
^\s*return\s*(\S+\s*)?=(?!\s*pure\s*)

Since a package could have more than one Monad instance (e.g. for different types) and a package having at least one return = pure doesn’t mean it has no return definitions with something else, right?

That seems to catch things like returningClause = as well. Also I’m not sure how best to collate the results.

Perhaps you could run the search yourself. You can download all of Hackage using this script: GitHub - nh2/hackage-download: Script to download all of Hackage


Speedy research! I’ll see if I can replicate such research later to see what the common themes are of packages. If the majority of the 452 packages are small, old, or a “leaf” package (no one really depends on it) I think that would likely be fine.

Having downloaded hackage with hackage-download, I’ve used rg '^\s+return\s*=' -l | sed 's|\([^/]+\)*/.*$|\1|g' | sort -u | wc -l to count the packages.

More interesting to me is that 431 packages still define pure = return.


Ah, right, should have been
^\s*return(\s+\S+)?\s*=(?!\s*pure\s*)

The \S is doing odd things; removing it (and ending up with the command rg '^\s*return(\s+)?\s*=(?!\s*pure\s*)' -l --pcre2 | sed 's|\([^/]+\)*/.*$|\1|g' | sort -u | wc -l) gives 565 for me.


I was trying to also have it match on return x = ... where ... doesn’t start with pure. (EDIT: I think I know why now… because \S also matches on =… so maybe [^\s=] would work? :thinking: )
But the 565 hits already shows that some of the 452 packages with return = pure also have return definitions that are not return = pure.

1006 - 452 = 554 packages that define return =, but do not define return = pure
565 packages define a return = (something other than pure)
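For anyone who wants to collate results per definition rather than per package, a token-based classifier sidesteps the \S problem discussed above. This is only a rough sketch and not the rg query used in this thread: the names ReturnDef and classify are made up here, it looks at one line at a time, and it assumes whitespace around the = sign.

```haskell
-- Hypothetical classification of a single source line.
data ReturnDef = Canonical | NonCanonical | NotReturn
  deriving (Show, Eq)

-- Splits the line into whitespace-separated tokens, so returningClause
-- is never confused with return. Assumes "=" is a separate token.
classify :: String -> ReturnDef
classify line =
  case words line of
    ("return" : rest) ->
      case break (== "=") rest of
        (args, "=" : body)
          -- return = pure, or return x = pure x
          | body == "pure" : args -> Canonical
          | otherwise             -> NonCanonical
        _ -> NotReturn   -- a use of return, not a definition
    _ -> NotReturn

main :: IO ()
main =
  mapM_ (print . classify)
    [ "  return = pure"
    , "  return x = pure x"
    , "  return a = Parser (\\s -> Just (a, s))"
    , "  returningClause = undefined"
    , "  return (x + y)"
    ]
```

Multi-line definitions and definitions written without spaces around = would be missed, so the counts from something like this are approximate at best.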

Because I teach Haskell in courses, I have an incentive to dislike “return”: it takes extra effort (from both me and my students) to note that, unlike in most other languages, it has nothing to do with control flow.

Actually “pure” is only a less bad name. IMO “fromPureToApplicative” or “toApplicativeFromPure” would be much better in terms of self-documenting names, but who else is going to agree? And the ship has sailed anyway.

P.S. I don’t have any objection to the phases and the pace.


Another question: other than Monad’s documentation (which I’m sure could be annotated in a way that makes the situation clear), is there any user-visible difference between:

  • current plan for phase 2:
    • move return and (>>) to top-level
    • deprecate -Wnoncanonical-monad-instances, which is now a no-op
    • add new logic to GHC to ignore any canonical but redundant definitions
  • a modified phase 2:
    • leave return and (>>) where they are (defer their relocation to phase 4)
    • change the current -Wnoncanonical-monad-instances code to raise an error instead of a warning
    • make the -Wnoncanonical-monad-instances flag a no-op and deprecate it

As I see it, both approaches newly forbid non-canonical definitions and still allow canonical definitions. The implementation cost seems lower in the second approach, since the existing infrastructure for the warnings is repurposed instead of having to add the concept of a class member that can be defined even if it doesn’t appear in the class. It is also a closer match to what is happening in the semigroup-monoid proposal, which is very similar and should probably be implemented in parallel—that proposal doesn’t move mappend out of Monoid until phase 4.
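The user-visible effect of the second approach can be approximated today by promoting the existing warning to an error in a single module. The sketch below assumes a GHC recent enough to accept the -Werror=<flag> syntax; Wrap is a hypothetical type invented for this example.

```haskell
{-# OPTIONS_GHC -Werror=noncanonical-monad-instances #-}

-- Hypothetical type, used only for illustration.
newtype Wrap a = Wrap a
  deriving (Show, Eq)

instance Functor Wrap where
  fmap f (Wrap a) = Wrap (f a)

instance Applicative Wrap where
  pure = Wrap
  Wrap f <*> Wrap a = Wrap (f a)

instance Monad Wrap where
  Wrap a >>= f = f a
  -- Canonical but redundant definitions: still accepted.
  return = pure
  (>>) = (*>)
  -- A non-canonical definition such as
  --   return = Wrap
  -- would be rejected under the pragma above.

main :: IO ()
main = print (Wrap (1 :: Int) >> return 'x')
```

This keeps old lawful code compiling while ruling out the definitions that cannot survive the later relocation of return and (>>).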


Thanks for engaging with the community.

The only developers who I suspect might be miffed by this are industrial users
with deployment on a large number of machines (for sure I recall seeing some
complaints by such users in haskell-cafe with regard to the
Applicative-Monad proposal).

I guess small steps and a clear timeline were introduced to be more
palatable to them.


I think I like this approach, along with your previous suggestion to merge phases 2 and 3. When this post is about a week old I’ll ask the CLC if we can amend the proposal for this more streamlined timeline.

This would mean that the new phase 2 would:

  • make noncanonical implementations of return and (>>) an error
  • make canonical implementations a warning

And the new phase 3 would:

  • move return and (>>) to the top level
    • by default this will make definitions of them a compile time error
  • optionally: investigate Teo’s suggestion

The only developers who I suspect might be miffed by this are industrial users

Sadly not. Also miffed will be any inexperienced user who comes across an old(ish) Haskell codebase in the future, tries to compile it, fails, has no idea why, and gives up, possibly leaving Haskell for good.
