Early feedback: left-biasing `max`

tomjaguarpaw · August 27, 2023, 9:13pm

Oh yes, for some reason I was thinking maximumBy took a a -> a -> a which chooses the maximum!

maxigit · August 27, 2023, 9:20pm

What i understand is that on one side some people (including me and the original haskell report) don’t really mind which the biais is as long as min and max have the opposite so that in effect min a b /= max a b (at pointer equality level) on the other side some might say if things are Eq then the biais is irrelevant (if it matters then you something to break the tie). Both way can be reconciled by saying that we care for min and max only. If things are in a list or any fodable then the output is not predictable .

rhendric · August 27, 2023, 9:30pm

I can absolutely respect that people have different intuitions around the behavior of the binary min and max. But nobody is served by saying that the behavior of minimum and maximum and *By is not predictable. This isn’t an interface where leaving things undefined frees implementers up to make choices that are good for their part of the ecosystem. I’m only calling for change so that things can be more predictable! Who is on the side of unpredictability—who would oppose me by saying no, this shouldn’t change because I would prefer it to be unpredictable? If you like unpredictability so much, change shouldn’t bother you! If you don’t, why object?

nemo · August 27, 2023, 9:57pm

I was today years old when I learned that max was right-biased.

I think no matter if you prefer right-bias or left-bias, having clear documentation about the behavior might be a good idea.

If the reactions to this change are split, you could have a function called maxr or maxl as a potential compromise.

Personally, I think max should remain right-biased, only because switching to left-bias might lead to silent breakage.

rhendric · August 27, 2023, 10:04pm

Would you be equally concerned about silent breaks if we kept max as-is but made Semigroup (Max a), maximum and maximumBy left-biased?

nemo · August 27, 2023, 10:45pm

That’s something I can get behind (:

kaol · August 28, 2023, 5:48am

It’s a property of a sorted list that max on any two elements of it taken in the sorted order returns the right hand one. Making max left leaning would break that property because it wouldn’t respect the order given by sort. I think your way of doing it is going to be less intuitive.

rhendric · August 28, 2023, 5:53am

It’s a property of a reverse-sorted list that max on any two elements of it taken in the reverse-sorted order returns the left-hand one… except it isn’t, because max happens to be right-biased.

It’s also a property of a reverse-sorted list that min on any two elements of it taken in the reverse-sorted order returns the right-hand one… except it isn’t, because min is left-biased.

The existence of types for which equality is not identity fundamentally causes you to surrender approximately half of all of the statements of the above form that you could imagine making. Is there any practical reason to prefer the ones we have over the ones we’d get by making min and max consistent?

maxigit · August 28, 2023, 6:17am

Maybe leaving it unpredictable will force people who do need predictability to use a tie breaker, example fst . maximum . flip zip [1..] instead of relying on implementation potentially savind them from bugs (in case of misbehaving implementation).

Probie · August 28, 2023, 6:39am

Personally, I don’t think it’s clear that left-biasing max is unambiguously the correct thing to do. I’m pretty happy with max returning the element min didn’t, and as far as maximum and minimum go, if you need a property stronger than “the returned element is greater (or less) than or equal to all other elements in the collection” (for example, that it is the first/last such element), you should be using a different function which is explicit about that.

In addition, even if it was an unambiguously beneficial change, changing the existing behaviour may break existing code in a subtle way. It’s not something like removing fail from Monad where the affected code now has compile errors.

bmacho · August 28, 2023, 8:23am

What if changed the description for Ord to stand for total preorders instead of total orders: we allowed a == b = True and a <= b && b <= a = True for distinct a and b? Then the questions like if a sort is stable, or which instance should a function return out of equivalent candidates would be lawful territory instead of unlawful/meaningless.

rhendric · August 28, 2023, 8:30am

There’s no order versus preorder confusion here. Relations represented by Ord do have to satisfy antisymmetry—a >= b && b >= a implies a == b.

The fact that a == b doesn’t imply that a is observationally identical to b isn’t represented in the mathematical relational hierarchy, at least not the one involving traditional orders and preorders (maybe a category theorist has a richer toolkit of relation-like things over gadgets defined only up to equivalence or something).

bmacho · August 28, 2023, 8:41am

I don’t think that for example (<=) = \x y -> True is a total order on Int, at least in the mathematical terminology. Int with (==) = \x y -> True, (<=) = \x y -> True would be a perfectly fine Ord, if not the description of Ord.

I am not sure what was their intent in the Haskell2010 report, I tried to find instances of Ord that are not total orders, but couldn’t find any.

rhendric · August 28, 2023, 8:47am

It definitely is not, given the standard instance of Eq Int. But if you asked me whether, given (==) = \x y -> True, (<=) = \x y -> True defines a total order on Int (or on any other type), I’d have to say yes.

bmacho · August 28, 2023, 9:01am

This confusion appeared in the linked issue #15921: Data.List.maximumBy uses counter-intuitive ordering · Issues · Glasgow Haskell Compiler / GHC · GitLab
although I don’t agree waldmann there, maximumBy (compare 'on' snd) should work regardless if the laws of Ord need total ordering or just a preordering.

rhendric · August 28, 2023, 9:08am

So, it’s correct that compare `on` snd doesn’t generally define a total ordering, given the Eq (a, b) instance likely to be in play (though it does if a is (), for example). But the documentation for maximumBy doesn’t (currently?) state that the comparison function provided needs to define a total ordering, nor should it. As long as the Ord (a, b) instance isn’t defined using compare `on` snd, no foul.

bmacho · August 28, 2023, 11:05am

Oh, right, waldmann most certainly meant that compare 'on' snd equality is not the same as (Int,Int) equality. It seems I am alone with my confusion.

But if you asked me whether, given (==) = \x y -> True, (<=) = \x y -> True defines a total order on Int (or on any other type), I’d have to say yes.

Is there a reason why should Haskell (==) play the role of math =? For me it is more logical for Haskell = to play the role of math =, and for Haskell (==) to play the role of math ==. For example in the case of Arg a b I would say that (==) = is an equivalence relation, and (<=) is a total preorder. That is my mathematical model, and it would be nice to use the same words, with the same meaning. Also nothing else cares about (==) on Arg a b and apart from ordering, everything else treats Arg 0 0 and Arg 0 1 differently, so it is weird to say that Arg 0 0 = Arg 0 1 in math, and antisymmetry holds.

edit: in summary, I think that that antisymmetry law is for (<=) to be a total preorder, and not a total order.

rhendric · August 28, 2023, 6:45pm

I think this is the crux of our disagreement, but there’s a lot of ambiguity smuggled into this statement. For me to go on this journey with you, I’d need you to define what you mean by ‘Haskell =’, ‘math =’, and ‘math ==’, the first two of which I think are overloaded concepts, and the last of which I haven’t seen without some prior definition.

This also might be better as a distinct thread if you want to continue the conversation; I don’t think we’d come back to the question of max, maximum, etc. from that direction.

rhendric · September 1, 2023, 3:21am

I’ve been reflecting on this for a few days and I have to admit, it’s sounding more compelling the more I think about it. Maybe it’s best to consider all of these functions as using argument-order or structure-order as the tie-breaker if Ord-order doesn’t distinguish between elements, instead of thinking about ‘bias direction’ as I have been.

augustss · September 1, 2023, 11:19pm

The reason for the different bias is simple: if you do (min x y, max x y) you know that the pair will contain either (x, y) or (y, x).
Maybe the reason is not spelt out in the report, but it was carefully designed that way.