Early feedback: left-biasing `max`

It is only an intuition, it doesn’t have to be correct.
I am just wondering if your intuition is not biased by the fact that you encountered this issue from a Foldable point of view.

I guess I’ll never know for sure, because I only learned about max being right-biased in the context of investigating an inconsistency in the Foldable functions. But I was surprised to learn that max favored the opposite argument to min, and I think I would have been equally surprised to learn that in any other context. It just strikes me as an odd choice in the context of software engineering, where it’s generally preferred that similar things behave similarly. I think that maybe some of you all think that an intrinsic part of max-ness is reaching toward the right, where min-ness reaches toward the left—the argument bias is for some reason coupled to the ordering of the values being compared. I see it as an approximation of the ideal min and max—we can’t have symmetric min and max, so we have to compromise, and I would expect that compromise to be the same for both functions.

It’s all kind of moot; as I said, the support for changing this doesn’t seem consistent enough for me to pursue it, so I’m going to pivot toward making the documentation as clear as possible, and possibly campaign for making the Ord documentation stricter about what min and max should be allowed to return (without changing their default implementation).


I think the doc and the implementation should be in sync (and ideally with the report).
However, as the doc was updated well after the code (and the report), it might be interesting to dig out the discussion behind that “update” of the doc.

Ah, as for that, I have these links:

Here David Feuer introduces the current wording seemingly off the top of his head:

And here’s the MR where the structural equality vs. Eq-equality question is discussed, but no mention of bias direction:

Given the decision to allow Ord instances to return things that are merely Eq-equal to either argument, the documentation is technically correct because there is no bias up to Eq-equality. The documentation merely suggests a left bias. So nothing really went wrong here. However, in order for maximum to be morally a special case of maximumBy, we really would need to make the requirements on max stronger than Eq-equality. I share your discomfort with the current documentation allowing implementations to return some sort of hybrid of the two arguments; that’s very much not what a function called max should do, in my opinion!

I think the merging of those pull requests would have benefited from better scrutiny; the doc is confusing at best.

I don’t know the original motivation, but I know plenty of cases when min / max returning something different from their arguments is useful.

  1. Quite often the difference between semantic and structural equality is in some sort of annotation with a provenance / position, e.g.:
import Data.Function (on)

-- (Position is some source-location type, defined elsewhere)
data Value = Value { getValue :: Int, getProvenance :: Position }

instance Eq Value where
  (==) = (==) `on` getValue

Now what is the provenance of min x y :: Value? It might be sensible to set Position to <no location>, or maybe to a position of min itself, or maybe combine position ranges somehow. In all these cases min x y is structurally neither x nor y (but semantically equal to one of them).

  2. Another case is when min / max does not evaluate anything, but just constructs a corresponding expression. Say,
data Expr 
  = Const Int 
  | Add Expr Expr 
  | Mul Expr Expr 
  | Min Expr Expr 
  | Max Expr Expr

and then simply

min :: Expr -> Expr -> Expr
min = Min

  3. Further, if evaluation of min / max requires some sort of normalisation of the arguments, why wouldn’t we return the normalised value? For example, it makes a lot of sense for min (Add (Const 1) (Const 2)) (Const 5) to return Const 3.
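To make point 3 concrete, here is a minimal sketch under the assumptions above; `eval` and `minE` are hypothetical names, and `Expr` is restated in trimmed form:

```haskell
-- Illustrative sketch of point 3: a "min" that normalises by evaluating.
data Expr
  = Const Int
  | Add Expr Expr
  | Min Expr Expr
  deriving (Eq, Show)

eval :: Expr -> Int
eval (Const n) = n
eval (Add a b) = eval a + eval b
eval (Min a b) = eval a `min` eval b

-- Returns the normalised lesser value: structurally a third value,
-- but semantically equal to one of the arguments.
minE :: Expr -> Expr -> Expr
minE a b = Const (eval a `min` eval b)
```

Here minE (Add (Const 1) (Const 2)) (Const 5) yields Const 3, which is structurally neither argument.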

A variation of the above is the following definition of min for ByteString:

min x y
  | Data.ByteString.null x = mempty
  | otherwise = ...

Even though x == mempty, it is not necessarily structurally equal to mempty (it might be a zero-length slice of a larger ByteString). And yet this is a sensible definition, useful to free up some over-retained data.


My issue with all three of these points is that they’re general arguments against ever having laws. Imagine, for example, extending Expr to include a primitive for semigroup append:

data Expr
 = ...
 | Append Expr Expr

and then

instance Semigroup Expr where
  (<>) = Append

But this instance violates the associativity law of Semigroup. Is that an argument against having the law?

I don’t find this to be a problematic min. The details of how ByteString is implemented internally shouldn’t matter here, as long as the API for ByteString is explicit about which functions peek behind the curtain and which are respectful of the boundary. I’ve been using the phrase ‘observable equality’ rather than ‘structural equality’ to try to highlight this: the essence of the law I would propose is that for any ‘safe’ f :: a -> Bool, let z = f (min x y) in z == f x || z == f y is always true. Hopefully any functions that can distinguish two empty ByteStrings by looking at what memory regions they slice either are documented as unsafe or wrap their results in IO?
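Assuming f ranges over safe observations as described above, that law can be written down as an ordinary testable property (`prop_minObservable` is my name for it, not anything in base):

```haskell
-- The proposed 'observable equality' law for min: the result must be
-- indistinguishable from one of the two arguments under any safe
-- observation f.
prop_minObservable :: Ord a => (a -> Bool) -> a -> a -> Bool
prop_minObservable f x y =
  let z = f (min x y) in z == f x || z == f y
```

In practice one would quantify over many observations (e.g. with QuickCheck); a lawful min must satisfy this for every safe f.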

I don’t think so. The laws are just required to hold up to some observation function.

@tomjaguarpaw, I don’t understand the point you’re trying to make. Bodigrim’s argument was, it’s convenient not to require Ord instances to obey a stricter version of its laws, because that allows more instances. My counter is that it’s always convenient in that way not to require instances to obey laws, and yet we have laws. I have no idea what your statement means in that context.

Perhaps it would help if I elaborate. There’s a natural tension between having stronger laws (good because you can prove more) and having more instances (good because you can use the class in more situations). One way of having many of the benefits of both worlds is to require laws to hold only up to a fairly fine-grained observation function. I see Bodigrim’s points as general arguments in favour of laws holding up to fairly fine-grained observation functions. (Not that I necessarily agree with that, but I think that’s the strongest conclusion one can derive from his points.) However, you said

My issue with all three of these points is that they’re general arguments against ever having laws.

That seems like far too strong a conclusion to me! Perhaps it would help if you explained how you got there from Bodigrim’s points.

EDIT: Or to put it another way, my interpretation of what you’re saying is “You can’t believe Bodigrim’s points yet also believe in ever having laws”. Did I get that right? If so could you please explain? That’s not obvious to me!

Yes, exactly.

Such as (==), in the current iteration of the min/max laws? With you so far, I think, although I’m not aware of other classes making this sort of allowance. When the Semigroup law says that (<>) should be associative, that’s not up to some particular observation function, is it? I default to interpreting such laws as holding up to any pure, safe predicate—when the documentation says x <> (y <> z) = (x <> y) <> z, I read that as meaning that for any safe function f :: a -> Bool, f (x <> (y <> z)) == f ((x <> y) <> z) (where now == can be simple boolean equality). That’s the same standard I’d like to use with min and max.

Is there a practical difference between arguing for weaker laws (such as laws up to some observation function) and arguing for fewer laws? They seem like the same thing to me—a stronger law subsumes a weaker law by adding an additional restriction, which could be seen as a second law in its own right.

What strikes me as more important than the difference between weaker and fewer is the spectrum with strong laws and fewer instances on one end and few/no/weak laws and many instances on the other end.

Bodigrim says that:

That’s the argument for pushing toward the many-instances side of the spectrum—that it’d be useful. That’s what I see as the general argument against laws—more instances are always useful!

He then gives some specific examples of cases that he thinks are particularly useful for min and max. I draw a parallel to one of those examples with Semigroup, and point out that the parallel would be useful in the same scenario that his example would be. By Bodigrim’s reasoning, I would expect him to object to a strict interpretation of Semigroup's laws in that scenario as well.

But I doubt that he does. Because ‘useful’ is not the deciding factor in determining what laws a class should have. Rather, what is most important is the intended meaning of the class abstraction. <> could represent some arbitrary binary operation—but we intend that the meaning of the Semigroup class is to represent, well, a semigroup. And to do that, <> must be associative.

Likewise, if min and max are to represent the abstract concepts of ‘take the lesser/greater of these two arguments’, it would be incorrect for them to (observably—under the hood I don’t care what they do) construct a third value. It might be ‘useful’, but it’s incorrect, and that’s that! If we let usefulness override our sense of what class operations are meant to represent, we would not have laws, because laws only get in the way of having more instances and being more useful.

No, but you didn’t say “My issue with all three of these points is that they’re general arguments in favour of weaker laws” you said “against ever having laws”! That’s a very strong statement. I don’t see how you concluded that. That is my only point of contention.

by Bodigrim’s reasoning, I would expect him to object to a strict interpretation of Semigroup's laws in that scenario as well. But I doubt that he does

I suspect he does! An example that was widely discussed in the Haskell community around ten years ago, as early effect systems were being thrashed out, was Apfelmus’s operational free monad. It allows you to distinguish m from m >>= return, violating right identity. In this debate there were two camps: 1) this is bad because it violates the law, and 2) this is fine because the laws still hold up to reasonable observation functions.
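As an illustrative toy (not operational's actual API), a free monad with its constructors exposed makes the violation visible: an observation that counts binds distinguishes m from m >>= return, so right identity fails literally even though it holds up to evaluation:

```haskell
{-# LANGUAGE GADTs #-}
import Control.Monad (ap)

-- A naive free monad with its constructors exposed.
data Prog a where
  Return :: a -> Prog a
  Bind   :: Prog b -> (b -> Prog a) -> Prog a

instance Functor Prog where
  fmap f m = Bind m (Return . f)
instance Applicative Prog where
  pure  = Return
  (<*>) = ap
instance Monad Prog where
  (>>=) = Bind

-- An 'unsafe' observation that sees the bind structure:
-- binds (Return x) is 0, but binds (Return x >>= return) is 1.
binds :: Prog a -> Int
binds (Return _) = 0
binds (Bind m _) = 1 + binds m
```

Hiding the constructors (as operational does) removes the offending observation, which is exactly what camp 2 relied on.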

All I’m trying to convey with that hyperbolic-sounding claim is that no matter where you currently are on the having-laws spectrum, someone can always point out that it’d be more useful to have fewer laws. By induction, this is an argument for no laws. I don’t expect anyone to actually follow that inductive argument to its absurd conclusion; I’m just saying that without some counterbalancing force, that argument alone takes you to no-law-town.

That’s an interesting bit of history!

Where’s the camp that this is fine if and only if:

  • the constructors of ProgramViewT are hidden in an internal module
  • any functions that can distinguish between the two sides of the right identity law are also hidden in an internal module, or otherwise labeled as unsafe or improper

Is that what camp 2 meant by reasonable observation functions?

Anyway, I note that despite this, the right identity law is still listed among the laws for Monad, without any specification of the observation functions for which it holds. So perhaps the community consensus is that it’s okay for laws to be strong, and for debatably-non-abiding instances to exist anyway if they’re useful enough—which would support my position.

someone can always point out that it’d be more useful to have fewer laws. By induction, this is an argument for no laws

Sorry, I still don’t think I understand this line of reasoning, but that’s OK. I don’t need to understand it. Hopefully I’ve stated my position clearly enough.

Is that what camp 2 meant by reasonable observation functions?

I don’t think that either the camps or the terms of the debate were well enough defined to give a precise answer to that question.

Yes, I think that’s right.

Not just one or a few observation functions: the usual convention is that the laws should hold up to observability through the public interface of the types involved. There should not be any safe public function exposed from the library that is able to tell that the laws do not hold.

That’s also why I would say Arg should be renamed to UnsafeArg as it breaks the Eq extensionality property. I think the only reason it is not explicitly named that currently is because Eq technically has no laws.
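For reference, Arg (from Data.Semigroup) equates values on the first component only, so a plain public accessor can tell two (==)-equal values apart; `argSnd` below is a hypothetical helper, not part of base:

```haskell
import Data.Semigroup (Arg(..))

-- Arg's Eq instance compares only the first component, so these two
-- values are (==)-equal despite differing in the second component.
left, right :: Arg Int String
left  = Arg 1 "left"
right = Arg 1 "right"

-- A safe, public observation that breaks Eq extensionality for Arg:
argSnd :: Arg a b -> b
argSnd (Arg _ b) = b
```

Here left == right is True, yet argSnd left /= argSnd right, which is exactly the extensionality breakage described above.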

It is. A good example is the pair of Applicative and Monad classes together with the Haxl monad, which does not satisfy the law p <*> q = p >>= \f -> q >>= \x -> pure (f x) literally, but does satisfy it up to observability through its public interface (although in the presence of I/O you can’t really reason about observational equivalence).
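At a concrete type where the law does hold literally, such as Maybe, it can be phrased as a checkable property (`propApBind` is an illustrative name):

```haskell
-- The Applicative/Monad consistency law, checkable at Maybe,
-- where both sides are structurally equal.
propApBind :: Maybe (Int -> Int) -> Maybe Int -> Bool
propApBind p q =
  (p <*> q) == (p >>= \f -> q >>= \x -> pure (f x))
```

Haxl satisfies this only up to its public interface, because its <*> may batch effects that >>= sequences.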


Sure, that is a very well-behaved convention, but that’s not the only convention as the example of operational shows.

I don’t see an obvious way that ProgramViewT violates right identity, can you show how that’s done? (In fact, it seems like it is essentially a free(r) monad to me)

Ah, I misremembered. It wasn’t operational that exposed operations that allowed you to obtain the “bind” structure of the program. Perhaps there were some other attempts at libraries that exposed the constructors of the type, or maybe it was a more philosophical argument that “morally” operational violated the monad laws even though “observationally” it didn’t. I wrote an article about an attempt at a free applicative that mentions the matter.

Anyway, I had an insight that maybe explains my confusion. Perhaps @rhendric thinks that some people are saying ‘the documentation for Ord should state “these laws hold only up to some observation function”’ or “we shouldn’t be stricter in the documentation about what min and max can return”. If they are saying that then I object too! (This thread has gone on so long it’s hard to check whether someone said that.) My position is that we should state that the laws hold “on the nose” yet accept conventionally that they will be violated in ways that work out reasonably in practice.

EDIT: Ah, maybe that is indeed what Bodigrim is saying in Early feedback: left-biasing `max` - #76 by Bodigrim. I didn’t interpret it like that, and I don’t think that’s what Bodigrim meant, but I can understand how one could interpret it like that. (FWIW I thought Bodigrim meant “it’s fine to make the documentation more strict but we should also recognise that sometimes we’re going to violate those laws for practical reasons”. I may well be wrong!)


I love these examples! I fail to see, however, why this reasoning should only apply to Ord, and not, say, Semigroup. It would be similarly useful if all the Semigroup laws were only defined up to Eq.

I also think that the current base documentation, Arg, and everything that builds on them goes against the Haskell Report. This change is for the worse, and you should probably consider changing it back: the only allowed min and max functions should be semantically equivalent to the min/max functions in the Haskell Report.

(As it does.)

Yes, that’s what I thought Bodigrim’s position was.

Basically agree, though I think all of Bodigrim’s numbered examples are cases to which I would not be comfortable turning a blind eye, personally. (The unnumbered ByteString example is totally fine.) But as a practical matter, non-abiding instances generally exist and there’s little that stricter folks like myself can do about it except disclaim that things you might expect to be true generally (like maximum = maximumBy compare) may not hold in the presence of such instances.