Semigroup (Sum Double) associativity tests failed?

I have come across an exercise that asks us to write an instance of Semigroup for some data types and then test the associativity law with QuickCheck. During testing, however, I found that the Sum Double instance always fails the test while other instances succeeded, for example String and Sum Int.

semigroupAssoc :: (Eq m, Semigroup m) => m -> m -> m -> Bool
semigroupAssoc a b c = (a <> (b <> c)) == ((a <> b) <> c)

main :: IO ()
main =
  do 
    quickCheck (semigroupAssoc :: (Sum Double) -> (Sum Double) -> (Sum Double) -> Bool)

When I run the above code, the test always fails, for example with the following message:

*** Failed! Falsified (after 9 tests and 7 shrinks):     
Sum {getSum = 1.0}
Sum {getSum = 3.0}
Sum {getSum = 0.924}

I guess the reason could be related to floating point precision or addition overflow (probably not this one as integer addition tests were fine)? But I cannot be sure about this because after searching the internet, I found nothing related to this. Could someone please expand a bit on this?

Addition isn’t associative for floating point numbers, so there’s no reason to expect this test to pass.

Compare 2**53+(1+1) and (2**53+1)+1 for Doubles

4 Likes

Thank you so much! I see the reason now, but if it’s not associative why would it be an instance of Semigroup as one of its laws asks for associativity of the operation. Does it need to be put in a weaker algebra, like Magma which does not require the binary operation to be associative?

1 Like

…is Semigroup now a derivable typeclass in Haskell?

…@carter may have the answer here (well, one probably better than any of mine).

Not derivable but you get Num a => Semigroup (Sum a) for free.

Alas Num has no laws, just expected properties, so a Semigroup instance on a Num constraint is convenient but has some quirks, which I believe was @nor’s question.

2 Likes

This is an easy question: Sum Double has a Semigroup instance because in situations where code is generalized over an arbitrary Semigroup, it very often makes sense to declare differences in floating point rounding to be beneath your level of abstraction, making Double quite sensible to use in that context. If Sum Double did not have a Semigroup instance, it would be a pain in the @#! to have to work around this all the time.

2 Likes

May I piggyback one highly related question to this thread too since relevant experts are on the line too.

I have always been curious if any implementation of fixed-precision floating points or sometimes [MPFR](Arbitrary-precision arithmetic - Wikipedia (I have found two hackage packages fixed-precision & fixedprec) can ever be generally associative and commutative. Even if it means including some well-defined error-bound in the data, as long as its “equality operation” uses it.

1 Like

Btw. you can derive Semigroup and Monoid via Generically a if your datatype a has all semigroup/monoid fields.

1 Like

So I like to think of arithmetic equations in floating point as being true up to multiplications of the form 1+/- epsilon, where epsilon is a number corresponding to the least significant digits of the numbers at hand . The usual laws for math operations work if you add these little multiplications factors into your equalities.

Another way to think of it is: floating point numbers don’t represent a point, but an interval! So equality needs to be with respect to an interval that’s been suitably transformed by these little factors.

this can be realized by either traditional interval arithmetic plus explicit rounding modes for the calculations, or perhaps more elegantly by its cousin Ball arithmetic, where instead you have a point and a radius. In both interval and ball arithmetic, equality already has to grapple with questions of what does it mean for intervals to be equal, and you can decide what that means. I think you can make the case that Eq is too weak an interface for floating point, because we can’t pass in some epsilon parameter that lets us equate near enough numbers.

Eq and ord on floating point are still useful, but a semantic eq for floating point needs to use ORD and a close enough parameter that might be application dependent.

I should maybe play with this idea more. Sounds fun. A more pithy version of what I just said is “floating point is a log scale geometry approximation of the real number line with finite representation”

5 Likes

Only tangentially related, but I learned the other day that Double has an Enum instance, where succ (x :: Double) == x + 1. I felt many emotions after learning that piece of information :stuck_out_tongue:

2 Likes

I can already see the meme image macro for this one in my head :stuck_out_tongue:

oh yeah that made me sad the day i saw it

Perhaps in the absence of that specific Enum instance for Double:

That being the case, suggestions on what ought to have been done back in 1990 can be left here for future reference: