Why does Haskell not include boxed sum types?

Hello,

I was reading the Non-punning list and tuple syntax proposal, and noticed the Unboxed Sum* datatypes in it. Apparently they are even exposed through a pragma.

What I did not get is why boxed sums aren’t exposed as well. Obviously in many (most?) cases where you want this, you really want to define your own datatype… but a similar argument could be made about Tuples, no? Sometimes elevating a Tuple to its own type doesn’t make sense, I think, and I’d expect the same to hold for a Sum.

Is there some technical reason that it’s not included? Or is demand for it just that low?

*The Sum types are of course older than this proposal, this was just my first time explicitly noticing them.
*Addendum: The names being Sum and Tuple instead of Sum and Product or Tuple and Variant does bother me. But that one feels like it’s just a me issue :pensive:


I guess Either — nested, if needed — filled the gap.


That does sound possible. It’s just that nested Eithers look so ugly to me :confounded:

Compare

case sum4 of 
    Left x -> ...
    Right (Left x) -> ...
    Right (Right (Left x)) -> ...
    Right (Right (Right x)) -> ...

vs. (borrowing unboxed sum syntax)

case sum4 of 
    ( x | | | ) -> ...
    ( | x | | ) -> ...
    ( | | x | ) -> ...
    ( | | | x ) -> ...

But I guess that pain point isn’t that painful, especially since it’s a rare situation that is easily resolved by making your own datatype (probably a good idea anyway). The other pain point is the sense of broken symmetry that hurts my psyche, but that one might just be a me issue.
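
For reference, here is a rough sketch of the two options that exist today. The type `Sum4`, its constructors `In1` to `In4`, and the functions `describe` and `describeU` are made-up names for illustration only. The second function assumes a GHC that supports the `UnboxedSums` extension, where the real syntax uses `(# ... #)` rather than plain parentheses.

```haskell
{-# LANGUAGE UnboxedSums #-}
module Main where

-- Hypothetical four-way boxed sum; all names here are illustrative.
data Sum4 a b c d = In1 a | In2 b | In3 c | In4 d

-- Matching on the custom datatype: one named constructor per alternative.
describe :: Sum4 Int Bool Char String -> String
describe s = case s of
  In1 n -> "int " ++ show n
  In2 b -> "bool " ++ show b
  In3 c -> "char " ++ show c
  In4 t -> "string " ++ t

-- The same shape as an unboxed sum: the position of the value between
-- the bars selects the alternative. The spaces between bars matter,
-- since "||" would be lexed as an operator.
describeU :: (# Int | Bool | Char | String #) -> String
describeU s = case s of
  (# n | | | #) -> "int " ++ show n
  (# | b | | #) -> "bool " ++ show b
  (# | | c | #) -> "char " ++ show c
  (# | | | t #) -> "string " ++ t

main :: IO ()
main = do
  putStrLn (describe (In3 'x'))
  putStrLn (describeU (# | True | | #))
```

The custom datatype costs a few lines of declaration, but it gives each alternative a name, which is probably why it is the usual recommendation.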


I bet it was just demand, tracing all the way back to how people are biased towards products. In mathematical culture, tuples are everywhere “in nature”, while disjoint unions show up only in niches. In programming culture, records are much more prevalent than variants, to the point that records are used where variants should have been chosen.

All of what follows is from somewhat fuzzy memories of papers I’ve read in the past, so take it with a grain of salt as I may have some details wrong, but:

In a sense, Haskell’s call-by-need evaluation strategy is itself biased toward products. If f and g are total functions, h x = (f x, g x) is the unique total function such that fst . h = f and snd . h = g; but the equivalent statement for coproducts doesn’t hold even among total functions (the unit type allows two distinct functions, h _ = () and h' = \case { Left _ -> (); Right _ -> () }, where you can distinguish the two by their strictness). It does hold among strict functions, though (and then the original statement for products doesn’t; you can have two implementations that you can distinguish by their totality).
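
The coproduct half of that argument can be observed directly. The sketch below defines the two functions from the paragraph above and probes them with an undefined argument. The helpers `tryAny` and `terminates` are hypothetical names introduced here, built on `try` and `evaluate` from `Control.Exception`.

```haskell
{-# LANGUAGE LambdaCase #-}
module Main where

import Control.Exception (SomeException, evaluate, try)

-- Lazy version: never inspects its argument.
h :: Either a b -> ()
h _ = ()

-- Strict version: must evaluate the argument to pick a branch.
h' :: Either a b -> ()
h' = \case
  Left _  -> ()
  Right _ -> ()

-- Hypothetical helper: 'try' specialised to catch any exception.
tryAny :: IO a -> IO (Either SomeException a)
tryAny = try

-- Hypothetical helper: does forcing the value to weak head normal
-- form succeed without throwing?
terminates :: a -> IO Bool
terminates x = either (const False) (const True) <$> tryAny (evaluate x)

main :: IO ()
main = do
  let l = Left 1 :: Either Int Bool
      r = Right True :: Either Int Bool
      bottom = undefined :: Either Int Bool
  -- Both functions agree on every defined input.
  print (h l == h' l, h r == h' r)
  -- They differ only on an undefined input.
  print =<< terminates (h bottom)
  print =<< terminates (h' bottom)
```

On defined inputs the two functions are indistinguishable, so both satisfy the coproduct equations with `const ()` on each side. Only the undefined argument separates them: forcing `h bottom` should succeed, while forcing `h' bottom` should throw.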