Monad of No Return: Next Steps

The difficulty is that the benefits and costs are incommensurable. That is, unlike “will it be quicker to drive to my destination via route A or route B”, where the costs and benefits are both measurable in the same unit (time), i.e. commensurable, it’s not clear that we can measure the benefit (cleaner fundamental definitions) against the cost in time and frustration of fixing broken packages.

In general I’m in favour of asking the supporters of proposals to submit patches fixing breakage. If backwards-compatible patches have been submitted (regardless of acceptance) to all necessary packages in some privileged set (say Stackage) then I think that’s a very strong indicator that we should consider the stability concern dealt with. Private code is still a concern, and we should take it into account too. I’m not sure exactly what to do about that.

Sometimes see people post “I came back to Haskell after 10 years and now my program doesn’t compile – what should I do?”. I’m not willing to say that we should be willing to break their code just because they’ve been out of the community for 10 years. On the other hand I don’t think that should be our top concern, but we should consider it. I’m not proposing an alternative choice of action, just trying to broaden the scope of things we consider.

4 Likes

So, supposing M is a type for which \m k -> m >>= \_ -> k has ‘better’ operational semantics than \a1 a2 -> (id <$ a1) <*> a2 (not all types!), under the status quo (>>) is okay but (*>) is already bad. In such cases, it’s likely that someone would have noticed this at some point and placed a (*>) = (>>) definition (or equivalent) in there, which, yes, I do treat as a flag to be careful with what’s going on, as I mentioned previously.

If there really is no explicit (*>) definition, then the proposal does make (>>) bad as well, it’s true. But it’s not quite as dire as ‘everything was good, proposal makes it bad’—more ‘there was bad in one operator, but nobody noticed; proposal spreads the effect to a second, and maybe now people will notice it.’

1 Like

I don’t think this would be valid, since this would require (*>) to depend on a Monad instance existing which isn’t always the case.

As much as it annoys me to do so, it may make sense to (when we get to the last phase of moving methods out of the classes) define m >> k = m >>= \_ -> k as the top level definition if we are worried about the performance implications of making (>>) = (*>). If the (<*>) method is defined with ap, they’ll be fine, or if they have some other optimisation that feels (>>=)-like (like if [] <*> _ = [], then [] *> [1..] is fine performance wise).
Regardless, this would be an issue for that future phase which has to wait a while for the new warnings to percolate.

It feels a little cruel that rhendric (or others) has to attempt to patch every package that has ignored a warning for the past 4/5-10 years, when the warning was an attempt to move people towards fixing this issue.

We’re not breaking their code just because they’ve been out of the community; their code is breaking because the language has evolved while they weren’t here, and that’s ok. Additionally, their code will still work if they use any previous version of GHC. These changes first have to get merged into GHC, then released, and then that version of GHC has to be stable enough that people start using it. If you boot up an old project in that time, you’ll then get a warning/error that tells them what changed and why, which is as much hand holding as we could possibly give them in debugging their code.

Correct, not all M. But I suspect (I haven’t checked) this is the case for all widely-used monads, and the only ones it’s not true for are “lazy” variants of monads like Control.Monad.Trans.X.Lazy, that are hardly ever used because they’re too weird in general.

Yes indeed, and that’s what I did at #33: *> must be defined in instances to prevent space leak with forever :: hub.darcs.net.

Correct. I think the problem can occur exactly when (*>) has the default definition. I expect this to be “most” naively-implemented monads, i.e. ones where the authors just wanted to make a definition and be done with it, which in turn is probably most monads implemented outside of widely-used libraries. I also expect that, for historical reasons, (>>) is more widely used than (*>) in monadic code, but I’m less confident in that expectation.

EDIT: Oh, and I think that “my 100k line codebase used to run in constant memory and now it leaks space” wold be a really, really bad outcome, and I don’t consider “oh, I’ve already audited all my own Applicative instances, and now I’ve got to go and audit all the instances in my 1m lines of transitive dependencies” a satisfactory mitigation. Such an outcome would be much, much worse than a “neater” class hierarchy would be good.

4 Likes

I don’t understand. In cases where there’s no Monad instance the monad of no return proposal is irrelevant.

I don’t think “cruel” is the correct choice of word here!

Some people want something to change because they think the outcome will be better. Some other people will be worse off, however. If people in the former group are willing to put in the effort so that people in the latter group are not worse off then that removes a source of objection. If they’re not willing, the objection remains. It’s up to them if they want to challenge the objection through other means.

That also comes with significant risks, since it is potentially behaviour changing!

What I was responding to was your claim that:

To put my claim another way: I’m not willing to say that this change shouldn’t be held back because maintainers haven’t looked at their package warnings in a while. Perhaps it should!

1 Like

Ah, that looks like our crux then. I agree that if all widely-used monads with no explicit (*>) definition suffer from similar issues (leaks memory when used with forever?), then we need to have a hard look at cutting that part of the proposal.

I just checked, and Identity doesn’t leak. But maybe Identity doesn’t count?

Nor does NonEmpty, and that’s less of a special case.

Nor Either a.

I don’t know, which widely-used monads do you have in mind?

1 Like

I think I misunderstood; I thought you were proposing a new default definition of (*>).

Apologies, I think I was venting frustrations.
I think some source of this is that it’s hard to accept that people haven’t fixed these warnings when they’ve existed for so long - except if they haven’t updated their package in a long time, or they’re specifically choosing to ignore them. I think this is a difference of opinion.

By the time that phase comes about, we’ll have to make sure that maintainers are ready for their (>>) definitions to be correctly lined up, and it may even be worth looking how many folks define (>>) vs (*>).

This is demoralising to hear but understandable.

One smaller change we could make is make the move for just return instead of (>>) as well, since there shouldn’t be any performance concerns for that, and defer other changes until more research is performed. This would obviously not be my preferred solution, as it delays implementation further as well as complicating it.

1 Like

My impression from doing GHC upgrades on a medium-large corporate codebase is that a large cohort of maintainers while active do not fix warnings and mostly only adapt to upstream breaking changes when a PR is opened.

I think this is a reasonable approach given their capacity. Keeping your package compiling with no warnings takes a lot more effort. And especially redundant import warnings can be a pain to get rid of without loads of CPP. When your main workflow is via online CI systems, they are also easy to miss.

I agree that placing all the work on contributors is too much. I can imagine that this discourages people from contributing. It would be good to have an explicit group of people who are willing to submit patches and take this burden off contributors.

A good thing about the system as it stands, is that we do make sure patches get submitted before the new version of GHC comes out. In the past, I’ve seen bad situations where stuff only starts getting patched when Stackage nightly comes out and then everything gets delayed.

2 Likes

For what it’s worth:

Thanks for working on this proposal! I think “things”/“concepts” should have one name, and this name should be used consistently. If a name changes, we should warn when somebody is using the old name.

This is clear for “pure” vs “return”, but less so for operators *> vs >> (I am also less informed about the second case, and there seem to be important technical issues to consider as outlined above).

2 Likes

Ah, well, to be clear, I didn’t say that widely-used monads leak space. Quite the opposite in fact. I expect widely-used monads to not leak space (because they have the “better” implementations of (*>) already).

I said that for widely-used monads, \m k -> m >>= \_ -> k is better than \a1 a2 -> (id <$ a1) <*> a2. Let’s have a look at an example. If we newtype Maybe and use Maybe’s (*>) (a handwritten one) there’s no stack leak. If we use the default (*>) there is a stack leak. (I expect most little-used Monad definitions in the wild to pick up the default (*>).)

This is actually quite subtle! forever in base doesn’t trigger the leak. It has a special definition to avoid that. But less carefully written combinators (like lots in my code below) will trigger the leak. EDIT: it also subtly depends on optimization levels, which is something that always concerns me when it comes to space leaks.

forever     :: (Applicative f) => f a -> f b
{-# INLINE forever #-}
forever a   = let a' = a *> a' in a'
-- Use explicit sharing here, as it prevents a space leak regardless of
-- optimizations.
{-# LANGUAGE GHC2021 #-}
{-# LANGUAGE LambdaCase #-}

import Control.Monad
import System.Environment
import Data.Coerce

newtype MyMaybe1 a = MyMaybe1 (Maybe a)
  deriving (Functor, Show)

instance Applicative MyMaybe1 where
  pure :: forall a. a -> MyMaybe1 a
  pure = coerce (pure @Maybe @a)

  (<*>) :: forall a b. MyMaybe1 (a -> b) -> MyMaybe1 a -> MyMaybe1 b
  (<*>) = coerce ((<*>) @Maybe @a @b)

  -- i.e. current default from `class Applicative`
  a1 *> a2 = (id <$ a1) <*> a2

newtype MyMaybe2 a = MyMaybe2 (Maybe a)
  deriving (Functor, Show)

instance Applicative MyMaybe2 where
  pure :: forall a. a -> MyMaybe2 a
  pure = coerce (pure @Maybe @a)

  (<*>) :: forall a b. MyMaybe2 (a -> b) -> MyMaybe2 a -> MyMaybe2 b
  (<*>) = coerce ((<*>) @Maybe @a @b)

  -- Maybe's override
  --
  -- https://www.stackage.org/haddock/lts-23.10/base-4.19.2.0/src/GHC.Base.html#line-1161
  MyMaybe2 (Just _m1) *> m2 = m2
  MyMaybe2 Nothing  *> _m2 = MyMaybe2 Nothing

times :: Applicative f => Integer -> f a -> f a
times 1 m = m
times n m = m *> times (n - 1) m

lots :: Applicative f => f a -> f a
lots = times 10_000_000

foo1 :: MyMaybe1 ()
foo1 = lots (pure ())

foo2 :: MyMaybe2 ()
foo2 = lots (pure ())

main :: IO ()
main = do
  getArgs >>= \case
    ["default"] -> do
      let !_ = foo1
      pure ()
    ["handwritten"] -> do
      let !_ = foo2
      pure ()

  putStrLn "All fine"
% ghc -v0 -fforce-recomp -rtsopts test28.hs && ./test28 handwritten +RTS -K100m
All fine
% ghc -v0 -fforce-recomp -rtsopts test28.hs && ./test28 default +RTS -K100m    
test28: Stack space overflow: current size 33624 bytes.
test28: Use `+RTS -Ksize -RTS' to increase it.
5 Likes

Thank you for writing it down, this is the same reasoning that @int-index and I had:

BTW, I’ve made a challenge to create a monad that leaks with the default implementation of (>>) and doesn’t leak with a custom one: https://x.com/effectfully/status/1888586726215426335

IIRC I’ve never seen a solution to this challenge apart from the one that I came up with myself: https://x.com/effectfully/status/1890195731970920500

It really is extremely easy to leak space with the default implementation of (*>) and I think it is quite hard to leak space with the default implementation of (>>) for a monad that isn’t completely bogus.

So replacing (*>) with (>>) without changing the default implementation of (*>) is shooting other people in the foot, i.e. bad.

Yes, defining (*>) in terms of (>>=) by default will make a bunch of code not build and will require devs who define non-monad applicatives to implement (*>) manually. I believe, this is desired:

2 Likes

Sounds like I’ll have to amend my amendment to remove the (>>) changes for now at least. Would you (or someone else) be interested in creating a proposal for the changes you suggest here (namely removing the default for (*>)?

1 Like

Longer response when I have more time, but: you may be right about space leaks, but there are other operational metrics on which, for some types, the default implementation of (*>) outperforms default (>>). A silly example:

data M a = M { len :: Int, elems :: [a] } deriving Functor
instance Applicative M where
  pure a = M 1 (pure a)
  M la as <*> M lb bs = M (la * lb) (as <*> bs)
instance Monad M where
  -- probably not the best implementation, but the point is you
  -- can't just multiply
  M la ma >>= k = M s (concat bss)
    where
    (Sum s, bss) = for ma $ \a -> let M lb bs = k a in (Sum lb, bs)
3 Likes

I care about space leaks way more than constant factors. Space leaks break apps, slightly slower (>>) doesn’t. Particularly when the constant factor is increased very insignificantly like in your example.

Incidentally, your example is precisely what I came up with for the leaking-default->> challenge: https://x.com/effectfully/status/1890195731970920500 – hence so far we’ve still only seen a grand total of one counter-example to the general “(>>) is better than (*>)” rule, while there’s a huge body of evidence of the opposite with examples in base and transformers.

2 Likes

Fine, here’s a different one. I guarantee you will notice the performance difference between (*>) and (>>) here if you run it.

data Balance = Balanced Int | Unbalanced deriving (Eq, Show)

addBalance :: Balance -> Balance -> Balance
addBalance (Balanced x) (Balanced y) = Balanced (x + y)
addBalance _ _ = Unbalanced

eqSuccBalance :: Balance -> Balance -> Balance
eqSuccBalance (Balanced x) (Balanced y) | x == y = Balanced (x + 1)
eqSuccBalance _ _ = Unbalanced

data BinTree a = Branch Balance (BinTree a) (BinTree a) | Leaf a deriving (Eq, Functor, Show)

mkBranch :: BinTree a -> BinTree a -> BinTree a
mkBranch l r = Branch (getBalance l `eqSuccBalance` getBalance r) l r

getBalance :: BinTree a -> Balance
getBalance (Leaf _) = Balanced 0
getBalance (Branch b _ _) = b

instance Applicative BinTree where
  pure = Leaf
  Leaf f <*> ma = f <$> ma
  Branch b l r <*> ma = Branch (b `addBalance` getBalance ma) (l <*> ma) (r <*> ma)

instance Monad BinTree where
  Leaf a >>= k = k a
  Branch b l r >>= k = mkBranch (l >>= k) (r >>= k)


bbt :: Int -> BinTree ()
bbt 0 = pure ()
bbt n = let t = bbt (n - 1) in mkBranch t t

main = do
  let t1 = mkBranch (bbt 501) (bbt 500)
  let t2 = bbt 2
  print $ getBalance (t1 *> t2)
  print $ getBalance (t1 >> t2)

There’s a shared thread here. Both this and the list-with-length example have the following features:

  • Values of the monad can contain more than one ‘element’—I don’t know how to formalize this off the top of my head, but it’s a property that the monads in transformers all lack.
  • There is some sort of ‘summary statistic’ semirig-like thing, and a ‘homomorphism’ between the monad and the summary semirig that maps (<*>) to multiplication and (>>=) to n-ary addition.
  • Multiplication is much faster than n-ary addition (even in my first example, it’s not a constant factor as you said but the difference between O(1) and O(n)).

Forgive the gesturing and scare quotes; if I spent more time on this, I could probably devise some laws and find the proper terminology, but I hope you can get the idea without that.

One of the big ideas behind using Applicative as an abstraction distinct from Monad is this ‘static analysis at run-time’ flavor (sounds oxymoronic, but I don’t know how else to describe it). In Applicative, you can do things that depend only on the shapes of your functor values, without forcing computation that depends on their elements. The pattern I describe above is an example of this, and generally speaking, we want both (<*>) and (*>) to take advantage of the gains that ensue.

But for monads that only hold zero or one ‘elements’, and that have fixed shapes or don’t bother themselves with shape-only computation, then it is true that using the productive/‘monadic tail call’ (>>) implementation is better. I also do not want to lose this advantage, for those types. I in fact want to spread that advantage to the definition of (*>), for those types.

From this comment (very helpful detail at the end of this, by the way; thank you!),

it sounds like @tomjaguarpaw is concerned that there are obscure monads out in the wild that have ‘bad’ implementations of (*>) but ‘good’ implementations of (>>)—and here I want to emphasize that ‘bad’ and ‘good’ are relative to whether that monad wants an Applicative-flavored implementation (because it contains multiple elements, etc.) or a Monad-flavored one (because it contains a single element and can benefit from a monadic tail call). In that case, what I want from this proposal is to not preserve this. Those obscure monads need our help and attention! I don’t want a developer encountering such a monad and using (*>) when they ‘should’ be using (>>), because our documentation, our current on-by-default warnings, and much of our folklore states that they are interchangeable up to types. I care about that developer as much as I care about the developer who has to spend a late night tracking down a stack overflow from a ‘bad’ (>>) in someone else’s library.

The ideal end state here is every monad has a definition, in (*>), that is optimal for that type, and there is no confusion for anyone whether (*>) or (>>) is the correct operator to use. Can we align on that ideal? If so, maybe we can brainstorm some new approaches to get there that minimize bad side effects. For example, maybe GHC can make a guess as to whether a particular type should have an Applicative-flavored or Monad-flavored implementation of (*>), and warn if the implementation doesn’t match. Helping me operationalize the difference between the two would be useful there.

However, it is also possible that we can’t align on the ideal, because someone thinks that there exist types for which, in different applications, you might want either the Applicative-flavored behavior or the Monad-flavored behavior. I have been assuming (partly based on the age of the proposal that implies this) that, for a given f, there exists one best implementation of forall a b. f a -> f b -> f b. If this is false, then we need to do the following:

  • immediately remove and backport the warnings telling users to make (*>) and (>>) the same, and
  • update the documentation for (*>) and (>>) to explain why they’re different and in what circumstances users should prefer one or the other.
6 Likes

Great example, thank you. But I think it illustrates my point better than it illustrates yours. Because this isn’t the full list:

You also need:

  • to store the statistic in a way that doesn’t allow you to compute and extract it lazily (if Balance was outside of both Branch and Leaf, the (>>) behavior wouldn’t be this bad given enough ~ annotations)
  • to only access the statistics and not the content
  • to add this statistic specifically for performance purposes and then fail to implement the optimized functions. This is in drastic contrast with Maybe/Either/ReaderT/StateT/etc which exist for their abstraction value, not performance reasons

This is an extremely specific setup.

In fact, I just took a look at the Applicative instance for Either and here it is in entirety without having a definition for *>, i.e. leaking for no good reason if the code is compiled without optimizations (I assume GHC is able to fix the leak with -O1, which is the default, but I didn’t check):

This doesn’t look intentional, since Maybe does have a custom definition for (*>).

If the Haskell ecosystem can’t get the most common and basic applicatives right after multiple decades of their existence, I’m not too worried about someone failing to optimize their very specific library whose whole purpose is optimization – I’m worried about the basic things.

Those obscure monads are Either. As well as ReaderT/StateT/ExceptT in the not so distant past.

We can align on that ideal, but probably not on its achievability.

That sounds like quite a lot of work for not so much benefit. To me (*>) defined in terms of (>>=) still looks like a very clear winner as a default.

I think it’s irrelevant. It’s probably conceivable that there may exist two such operations with the same type, but replacing (>>) with (*>) and getting different behavior is ridiculous as we all seem to agree here, so the author should just commit to these two meaning the same thing.

1 Like

@rhendric I checked system-filepath but it did not have any non-canonical monads. It had a formally non-canonical Monoid with definitions

instance Semigroup
  (<>) = append
instance Monoid
  mappend = append

I turned this into mappend = (<>) but this is purely busy-work.
It is not that there was any non-canonical implementation to start with, it is just that the analysis is too, frankly spoken, dumb.
So maybe some effort could go into implementing smarter analyses, rather than forcing busy-work on maintainers.
This does not seem to be the reflex in the Haskell world. If Haskell had yet stepped up from a programming language to a software engineering language, it would be the reflex.

5 Likes

I love this way of putting it!

I have been following this thread on the side, but it is quite hard to understand the discussion about *> and >>. Can we maybe write this up, and add it to the proposal, so that it is a bit more accessible?

As far as I understand, moving (>>) to top level and defining (>>) = (*>) introduces space leaks in some cases (e.g., Either), but improves performance in some other cases (which ones?). Is this correct?

1 Like