`Monoid` of no `mappend`: Call for examples

I’m continuing my work on the monad of no return stuff after some discussion with the CLC.

A point of contention is that of a theoretical more powerful set of Monoids. Namely, as I am trying to define, definitively, without exception, that mappend = (<>), are there any Monoids (in existing especially packages or that come to mind now) that are missing out?

Here is a contrived example of the kind of thing we are looking for:

data X a = S a | V [a]
 deriving Eq

instance Semigroup a => Semigroup (X a) where
  S a  <> S b  = S $ a <> b
  S a  <> V bs = V $ fmap (a <>) bs
  V as <> S b  = V $ fmap (<> b) as
  V as <> V bs = V $ zipWith (<>) as bs

instance (Monoid a, Eq a) => Monoid (X a) where
  mempty = S mempty
  mappend a b
    | a == mempty = b
    | b == mempty = b
    | otherwise   = a <> b

We’re looking for similar sorts of Monoids in the wild that people would actually use.

My point of view is that most wouldn’t choose to use the mempty value in mappend, and that in any concrete/non-generic implementation that they could choose to do so in (<>) if they really wanted to. In the above example, I’d also argue that creating an eqMappend would make more sense than having this as a mappend operator, so that Monoid (X a) has as minimal constraints as possible.

13 Likes

I would like to echo this comment by rhendric and say that, for the sanity of the ecosystem, let us not encourage such instances even if they exist in the wild. Having to actively decide between using (<>) and mappend every time is not a burden I want to bear.

For this particular example, one can move the Eq a => and the mappend implementation to the Semigroup instance. If an Eq-less X is desirable it can be a separate type with the current Semigroup instance and a Monoid instance with default mappend.

13 Likes

i am fully in favor of making a proposal for this. even if you have a reason for having a (<>) different to mappend, probably means you have a type that has multiple reasonable different Semigroup implementations, and the best alternative is to implement each (<>) in a separate newtype.

4 Likes

I don’t quite understand what the example is showing. Is it supposed to be a pair of Semigroup, Monoid instances on the same type where mappend is not the same function as (<>)? I think we already have a few of those in base, and these are usually disambiguated via newtype.
For example, For any semigroup s the type Maybe s has semigroup and monoid instances that agree on the binary operator, but since Maybe is also an Alternative there is a distinct mappend operation via Alt Maybe s. My first reflex for your example would be to use a similar newtype in the Monoid (X a) instance.

However, using equational reasoning, if mappend is identical to (<>) for the underlying monoid a, then

S mempty  <> S b  = S $ mempty <> b = S b
S mempty  <> V bs = V $ fmap (mempty <>) bs = V $ fmap id bs = V bs
V as <> S mempty  = V $ fmap (<> mempty) as = V $ fmap id as = V as

showing that the two binary operations are actually not different. Or is the example only illustrating that the two identical binary operations could be defined differently and therefore the instance declarations be broken by removing mappend from the Monoid class?

The example is meant to show a mappend that while equivalent in outcome is not equivalent in implementation. My ask here is are there many (any?) such mappend implementations which cannot usefully be defined for Semigroup alone, and require Monoid to implement something like mappend? Indeed, using a newtype one could do the following:

-- for short circuit on eq
newtype ShortEq a = MkShortEq a
  deriving Eq
  deriving newtype Monoid -- assuming `mappend` doesn't cause trouble
instance (Eq a, Monoid a) => Semigroup (ShortEq a) where
  (<>) a b
    | a == mempty = b
    | b == mempty = a
    | otherwise = (coerce @(a -> a -> a) (<>)) a b

And then you can “easily” use Eq to shortcircuit any Monoid.

I’m having a hard time following this argument.

If you don’t care about the operational difference (which you apparently don’t), then you just use (<>) and be done with it.

1 Like

Sometimes you don’t have a choice though. Like if you use Foldable1, it’s going to use <> and cannot have access to a differently-defined mappend.

This smells like ap-bind consistency all over again to me.

4 Likes

Which it literally is, given things like Data.Monoid.Ap and instance (Monad m, Monoid a) => Monoid (IO a). Was there a consensus about ap-bind consistency in the AMP?
The Foldable class contains many redundant methods precisely because specialized optimizations are often possible. This would be a data point in favour of keeping mappend as a (redundant) class method.

I’m trying to find a mappend implementation that justifies having mappend separately. So far I have not. The Eq optimization above can easily be recovered by using my ShortEq suggestion, but what other examples exist in the wild? I’ve yet to see any justifiable reason for separate mappend.

For the record I’d be in favour of drastically reducing the traverse, mapm etc foldables and traversables, but those have other issues currently out of scope.

3 Likes

if you mean making Monoid a superclass of Semigroup, it means that you can’t have Semigroup (NonEmpty a) for example (unless you do some kind of zipping monoid which is usually not wanted).

2 Likes

Here is how I would read the following:

class A x => B x where
  foo :: x

instance C z => B (Y z) where
  foo = undefined

For every instance of B, A must also hold for that type. For Y z’s instance of B, C must additionally hold for z.

To get back on track, currently the default implementation of mappend = (<>) is fine. But I’m trying to remove the ability to override that, as I think there is no case where it is a good idea to be able to provide that override.

1 Like

It does introduce a cyclic dependency as well as an odd instance that will hurt type errors I think. If this instance is in scope and someone tries to use <> on a type that does not have Semigroup, instead of being told they don’t have a semigroup insurance they’ll be told they don’t have a monoid instance.

Maybe we should try to solve the abstract question first: is it ok to have additional class methods that only differ operationally from others?

Because this will come up every time someone suggests a “clean up”.

8 Likes

I think we should avoid them. if you have a concrete type you can easily use more constrained methods with no problem and if we have a need for a polymorphic generic implementation we can easily just create a

newtype ConstrainedPerformant a = MkConPer a
instance (C a) => D (ConstrainedPerformant a) where
  ...

i am unable to imagine a case where both fail but please feel free to enlighten me.

2 Likes

For this situation, the key is “can differ operationally”. As discussed elsewhere there are some methods which demonstrably do differ operationally (and we’ve yet to prove that there is a good way to unify them), but in this case the difference in operation is, in every case bar the contrived one above, negligible, hence why we’re trying to find any implementation of substance.

To answer your question more directly: we should clean them up unless given good reason why they cannot be unified or made to operate identically.

I think one should not burden every user of every Data.Foo module to read the documentation of every Foldable instance and scan for sentences like toList is inefficient for this data type, use this special implementation instead!

Similarly users should not be burdened to understand the difference in how <> and mappend work for each type, and each type should either have a single way to combine using these methods, or provide their own binary combinators.

I’m not sure what gave that impression, I do care about operational differences. I agree with the rest of your statement.

Just to be clear: there are additional class methods such as null on Foldable, which are redundant with other methods on the same class (foldr) but give instance authors the ability to provide operationally better implementations. I don’t think you’re talking about those, and I don’t think anyone is seriously challenging those.

There are also class methods on subclasses that are redundant with other methods on their superclasses, but may differ operationally, and that’s what we’re trying to clean up.

With respect to that latter category, I think answering your abstract question is precisely what we should be doing. We’ve yet to find a non-hypothetical example of such a thing that couldn’t be refactored by putting the operationally-best implementation in the superclass. That suggests to me that the potential upside of endorsing that kind of redundant method, as an abstract principle, is limited.

The potential downside is not limited. If we justify mappend by the argument that a Monoid instance might be able to take advantage of a more specific context (in the running example, a context with Eq in it) to provide a better mappend implementation, that justification applies equally to any class that is a subclass of another. We should slap an ordEq :: a -> a -> Bool method onto Ord, because an Ord instance might have a context that enables a better implementation than (==). We should give Fractional copies of all the Num operators, because a Fractional instance might etc. etc.

Maybe there is another, better justification for keeping mappend (or (>>)) around. But this argument alone (that there could be an operationally better implementation that can’t be moved to (<>)) should not suffice, else we open the gates to a very silly place.

9 Likes

I completely agree!

If it is feasible to write newtypes with the specialized, faster implementations of (<>), (>>), …, of the base type, we should go that way.

1 Like