It is only an intuition, it doesn’t have to be correct.
I am just wondering whether your intuition is biased by the fact that you encounter this issue from a `Foldable` point of view.
I guess I’ll never know for sure, because I only learned about `max` being right-biased in the context of investigating an inconsistency in the `Foldable` functions. But I was surprised to learn that `max` favored the opposite argument to `min`, and I think I would have been equally surprised to learn that in any other context. It just strikes me as an odd choice in the context of software engineering, where it’s generally preferred that similar things behave similarly. I think that maybe some of you all think that an intrinsic part of `max`-ness is reaching toward the right, where `min`-ness reaches toward the left—the argument bias is for some reason coupled to the ordering of the values being compared. I see it as an approximation of the ideal `min` and `max`—we can’t have symmetric `min` and `max`, so we have to compromise, and I would expect that compromise to be the same for both functions.
It’s all kind of moot; as I said, the support for changing this doesn’t seem consistent enough for me to pursue it, so I’m going to pivot toward making the documentation as clear as possible, and possibly campaign for making the `Ord` documentation stricter about what `min` and `max` should be allowed to return (without changing their default implementation).
I think the doc and the implementation should be in sync (and ideally with the report).
However, since the doc was updated well after the code (and the report), it might be interesting to dig out the discussion around that “update” of the doc.
Ah, as for that, I have these links:
Here David Feuer introduces the current wording seemingly off the top of his head:
And here’s the MR where the structural equality vs. `Eq`-equality question is discussed, but with no mention of bias direction:
Given the decision to allow `Ord` instances to return things that are merely `Eq`-equal to either argument, the documentation is technically correct because there is no bias up to `Eq`-equality. The documentation merely suggests a left bias. So nothing really went wrong here. However, in order for `maximum` to be morally a special case of `maximumBy`, we really would need to make the requirements on `max` stronger than `Eq`-equality. I share your discomfort with the current documentation allowing implementations to return some sort of hybrid of the two arguments; that’s very much not what a function called `max` should do, in my opinion!
I think the merging of those pull requests would have benefited from better scrutiny, the doc being confusing at best.
I don’t know the original motivation, but I know plenty of cases where `min`/`max` returning something different from their arguments is useful.
- Quite often the difference between semantic and structural equality is in some sort of annotation with a provenance / position. E.g.,

```haskell
data Value = Value { getValue :: Int, getProvenance :: Position }

instance Eq Value where
  (==) = (==) `on` getValue
```

Now what is the provenance of `min x y :: Value`? It might be sensible to set the `Position` to `<no location>`, or maybe to the position of `min` itself, or maybe to combine position ranges somehow. In all these cases `min x y` is structurally neither `x` nor `y` (but semantically equal to one of them).
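To make that first case concrete, here is a hedged sketch of such an instance. The `Position` type and the merging policy are my invention for illustration, not anything from the thread or from `base`:

```haskell
import Data.Function (on)

-- Hypothetical provenance type; `Merged` records that a value was
-- produced by combining two annotated values.
data Position = NoLocation | At String | Merged Position Position
  deriving (Eq, Show)

data Value = Value { getValue :: Int, getProvenance :: Position }
  deriving Show

instance Eq Value where
  (==) = (==) `on` getValue

instance Ord Value where
  compare = compare `on` getValue
  -- min/max merge the provenance of both arguments, so the result is
  -- Eq-equal to one argument but structurally equal to neither.
  min x y = Value (getValue x `min` getValue y)
                  (Merged (getProvenance x) (getProvenance y))
  max x y = Value (getValue x `max` getValue y)
                  (Merged (getProvenance x) (getProvenance y))
```

Under this instance, `min x y == x` or `min x y == y` always holds (it is `Eq`-equal to one argument), yet `getProvenance (min x y)` can differ from the provenance of both arguments.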
- Another case is that maybe `min`/`max` does not evaluate anything, but just constructs a corresponding expression. Say,

```haskell
data Expr
  = Const Int
  | Add Expr Expr
  | Mul Expr Expr
  | Min Expr Expr
  | Max Expr Expr
```

and then simply

```haskell
min :: Expr -> Expr -> Expr
min = Min
```
- Further, if evaluation of `min`/`max` requires some sort of normalisation of the arguments, why would we not return the normalised value? For example, it makes a lot of sense for `min (Add (Const 1) (Const 2)) (Const 5)` to return `Const 3`.
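A hedged sketch of that normalising case, written as a standalone `minExpr` rather than an actual `Ord` instance (the names `eval` and `minExpr` are mine):

```haskell
data Expr
  = Const Int
  | Add Expr Expr
  | Mul Expr Expr
  deriving (Eq, Show)

-- Evaluate an expression to its normal form.
eval :: Expr -> Int
eval (Const n) = n
eval (Add x y) = eval x + eval y
eval (Mul x y) = eval x * eval y

-- A normalising min: the result is always a Const, so it can be
-- structurally equal to neither argument while being semantically
-- equal to one of them.
minExpr :: Expr -> Expr -> Expr
minExpr x y = Const (eval x `min` eval y)
```

Here `minExpr (Add (Const 1) (Const 2)) (Const 5)` is `Const 3`, which is structurally distinct from both arguments.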
A variation of the above is the following definition of `min` for `ByteString`:

```haskell
min x y
  | Data.ByteString.null x = mempty
  | otherwise = ...
```

Even while `x == mempty`, it is not necessarily structurally equal to it (it might be a zero-length slice of a larger `ByteString`). And yet this is a sensible definition, useful to free up some over-retained data.
My issue with all three of these points is that they’re general arguments against ever having laws. Imagine, for example, extending `Expr` to include a primitive for semigroup append:

```haskell
data Expr
  = ...
  | Append Expr Expr
```

and then

```haskell
instance Semigroup Expr where
  (<>) = Append
```

But this instance violates the associativity law of `Semigroup`. Is that an argument against having the law?
I don’t find this to be a problematic `min`. The details of how `ByteString` is implemented internally shouldn’t matter here, as long as the API for `ByteString` is explicit about which functions peek behind the curtain and which are respectful of the boundary. I’ve been using the phrase ‘observable equality’ rather than ‘structural equality’ to try to highlight this: the essence of the law I would propose is that for any ‘safe’ `f :: a -> Bool`, `let z = f (min x y) in z == f x || z == f y` is always true. Hopefully any functions that can distinguish two empty `ByteString`s by looking at what memory regions they slice are either documented as unsafe or wrap their results in `IO`?
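That proposed law is easy to state in code. Here is a minimal sketch (the helper name `lawHolds` and the sample observers are mine) that checks it against a finite list of observation functions standing in for "any safe `f`":

```haskell
-- Sketch: check the proposed observable-equality law for `min`
-- against a given list of observation functions.
lawHolds :: Ord a => [a -> Bool] -> a -> a -> Bool
lawHolds observers x y = all holds observers
  where
    -- the law from the post: let z = f (min x y) in z == f x || z == f y
    holds f = let z = f (min x y) in z == f x || z == f y
```

For the standard `min` on `Int` every observer passes trivially, because `min x y` is literally one of the arguments; the law only bites for instances whose `min` builds an observably new value.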
I don’t think so. The laws are just required to hold up to some observation function.
@tomjaguarpaw, I don’t understand the point you’re trying to make. Bodigrim’s argument was: it’s convenient not to require `Ord` instances to obey a stricter version of its laws, because that allows more instances. My counter is that it’s always convenient in that way not to require instances to obey laws, and yet we have laws. I have no idea what your statement means in that context.
Perhaps it would help if I elaborate. There’s a natural tension between having stronger laws (good because you can prove more) and having more instances (good because you can use the class in more situations). One way of having many of the benefits of both worlds is to require laws to hold only up to a fairly fine-grained observation function. I see Bodigrim’s points as general arguments in favour of laws holding up to fairly fine-grained observation functions. (Not that I necessarily agree with that, but I think that’s the strongest conclusion one can derive from his points.) However, you said
> My issue with all three of these points is that they’re general arguments against ever having laws.
That seems like far too strong a conclusion to me! Perhaps it would help if you explained how you got there from Bodigrim’s points.
EDIT: Or to put it another way, my interpretation of what you’re saying is “You can’t believe Bodigrim’s points yet also believe in ever having laws”. Did I get that right? If so could you please explain? That’s not obvious to me!
Yes, exactly.
Such as `(==)`, in the current iteration of the `min`/`max` laws? With you so far, I think, although I’m not aware of other classes making this sort of allowance. When the `Semigroup` law says that `(<>)` should be associative, that’s not up to some particular observation function, is it? I default to interpreting such laws as holding up to any pure, safe predicate—when the documentation says `x <> (y <> z) = (x <> y) <> z`, I read that as meaning that for any safe function `f :: a -> Bool`, `f (x <> (y <> z)) == f ((x <> y) <> z)` (where now `==` can be simple boolean equality). That’s the same standard I’d like to use with `min` and `max`.
Is there a practical difference between arguing for weaker laws (such as laws up to some observation function) and arguing for fewer laws? They seem like the same thing to me—a stronger law subsumes a weaker law by adding an additional restriction, which could be seen as a second law in its own right.
What strikes me as more important than the difference between weaker and fewer is the spectrum with strong laws and fewer instances on one end and few/no/weak laws and many instances on the other end.
Bodigrim says that:
That’s the argument for pushing toward the many-instances side of the spectrum—that it’d be useful. That’s what I see as the general argument against laws—more instances are always useful!
He then gives some specific examples of cases that he thinks are particularly useful for `min` and `max`. I draw a parallel to one of those examples with `Semigroup`, and point out that this would be useful in the same scenario that his example would be useful—by Bodigrim’s reasoning, I would expect him to object to a strict interpretation of `Semigroup`’s laws in that scenario as well.
But I doubt that he does. Because ‘useful’ is not the deciding factor in determining what laws a class should have. Rather, what is most important is the intended meaning of the class abstraction. `<>` could represent some arbitrary binary operation—but we intend that the meaning of the `Semigroup` class is to represent, well, a semigroup. And to do that, `<>` must be associative.
Likewise, if `min` and `max` are to represent the abstract concepts of ‘take the lesser/greater of these two arguments’, it would be incorrect for them to (observably—under the hood I don’t care what they do) construct a third value. It might be ‘useful’, but it’s incorrect, and that’s that! If we let usefulness override our sense of what class operations are meant to represent, we would not have laws, because laws only get in the way of having more instances and being more useful.
No, but you didn’t say “My issue with all three of these points is that they’re general arguments in favour of weaker laws”; you said “against ever having laws”! That’s a very strong statement. I don’t see how you concluded that. That is my only point of contention.
> by Bodigrim’s reasoning, I would expect him to object to a strict interpretation of `Semigroup`’s laws in that scenario as well. But I doubt that he does
I suspect he does! An example that was widely discussed in the Haskell community around ten years ago, as early effect systems were being thrashed out, was Apfelmus’s operational free monad. It allows you to distinguish `m` from `m >>= return`, violating right identity. In this debate there were two camps: 1) this is bad because it violates the law, and 2) this is fine because the laws still hold up to reasonable observation functions.
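To see how such a violation can arise, here is a deliberately naive deep embedding (this is not the actual `operational` API, whose constructors are hidden; the types and the `binds` observer are my invention) in which bind is a constructor, so an observation that inspects the tree can tell `m` apart from `m >>= return`:

```haskell
{-# LANGUAGE GADTs #-}

-- Naive deep embedding: return and bind are recorded as syntax.
data Prog a where
  Return :: a -> Prog a
  Bind   :: Prog b -> (b -> Prog a) -> Prog a

instance Functor Prog where
  fmap f m = Bind m (Return . f)

instance Applicative Prog where
  pure = Return
  mf <*> mx = Bind mf (\f -> Bind mx (Return . f))

instance Monad Prog where
  (>>=) = Bind

-- An observation that peeks at the tree: count the Bind nodes
-- along the spine.
binds :: Prog a -> Int
binds (Return _) = 0
binds (Bind m _) = 1 + binds m
```

Here `binds m` and `binds (m >>= return)` differ, so right identity fails under this observation, even though interpreting the program would give the same result either way.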
All I’m trying to convey with that hyperbolic-sounding claim is that no matter where you currently are on the having-laws spectrum, someone can always point out that it’d be more useful to have fewer laws. By induction, this is an argument for no laws. I don’t expect anyone to actually follow that inductive argument to its absurd conclusion; I’m just saying that without some counterbalancing force, that argument alone takes you to no-law-town.
That’s an interesting bit of history!
Where’s the camp that says this is fine if and only if:

- the constructors of `ProgramViewT` are hidden in an internal module
- any functions that can distinguish between the two sides of the right identity law are also hidden in an internal module, or otherwise labeled as unsafe or improper

Is that what camp 2 meant by reasonable observation functions?
Anyway, I note that despite this, the right identity law is still listed among the laws for `Monad`, without any specification of the observation functions for which it holds. So perhaps the community consensus is that it’s okay for laws to be strong, and for debatably-non-abiding instances to exist anyway if they’re useful enough—which would support my position.
> someone can always point out that it’d be more useful to have fewer laws. By induction, this is an argument for no laws
Sorry, I still don’t think I understand this line of reasoning, but that’s OK. I don’t need to understand it. Hopefully I’ve stated my position clearly enough.
> Is that what camp 2 meant by reasonable observation functions?
I don’t think that either the camps or the terms of the debate were well enough defined to give a precise answer to that question.
Yes, I think that’s right.
Not just one or a few observation functions, the usual convention is that the laws should hold up to observability through the public interface of the types involved. There should not be any safe public function exposed from the library that is able to tell that the laws do not hold.
That’s also why I would say `Arg` should be renamed to `UnsafeArg`, as it breaks the `Eq` extensionality property. I think the only reason it is not explicitly named that currently is because `Eq` technically has no laws.
It is. A good example is the `Applicative` and `Monad` classes and the Haxl monad, which does not satisfy the law `p <*> q = p >>= \f -> q >>= \x -> pure (f x)` literally, but does satisfy it up to observability through its public interface (although in the presence of I/O you can’t really reason about observational equivalence).
Sure, that is a very well-behaved convention, but it’s not the only convention, as the example of `operational` shows.
I don’t see an obvious way that `ProgramViewT` violates right identity; can you show how that’s done? (In fact, it seems like it is essentially a free(r) monad to me.)
Ah, I misremembered. It wasn’t `operational` that exposed operations that allowed you to obtain the “bind” structure of the program. Perhaps there were some other attempts at libraries that exposed the constructors of the type, or maybe it was a more philosophical argument that “morally” `operational` violated the monad laws even though “observationally” it didn’t. I wrote an article about an attempt at a free applicative that mentions the matter.
Anyway, I had an insight that maybe explains my confusion. Perhaps @rhendric thinks that some people are saying ‘the documentation for `Ord` should state “these laws hold only up to some observation function”’ or ‘we shouldn’t be stricter in the documentation about what `min` and `max` can return’. If they are saying that then I object too! (This thread has gone on so long that it’s hard to check whether someone said that.) My position is that we should state that the laws hold “on the nose” yet accept conventionally that they will be violated in ways that work out reasonably in practice.
EDIT: Ah, maybe that is indeed what Bodigrim is saying in Early feedback: left-biasing `max` - #76 by Bodigrim. I didn’t interpret it like that, and I don’t think that’s what Bodigrim meant, but I can understand how one could interpret it like that. (FWIW I thought Bodigrim meant “it’s fine to make the documentation more strict but we should also recognise that sometimes we’re going to violate those laws for practical reasons”. I may well be wrong!)
I love these examples! I fail to see, however, why this reasoning should only apply to `Ord` and not, say, `Semigroup`. It would be similarly useful if all the `Semigroup` laws were only defined up to `Eq`.
Also I think that the current `base` documentation, and `Arg`, and everything that builds on it goes against the Haskell report. And I think this change is for the worse, and you should probably consider changing it back: the only allowed `min` and `max` functions should be semantically equivalent to the `min`/`max` functions in the Haskell report.
(As it does.)
Yes, that’s what I thought Bodigrim’s position was.
Basically agree, though I think all of Bodigrim’s numbered examples are cases to which I would not be comfortable turning a blind eye, personally. (The unnumbered `ByteString` example is totally fine.) But as a practical matter, non-abiding instances generally exist and there’s little that stricter folks like myself can do about it except disclaim that things you might expect to be true generally (like `maximum = maximumBy compare`) may not hold in the presence of such instances.