Will GHC unpack GADT sum types? And are Coercible constraints free?

clinton · January 10, 2025, 2:39pm

Consider something like the following:

type role Silly representational
data Silly a where
  SillyAny :: Silly a
  SillyInt :: (Coercible a Int) => Silly a
  SillyChar :: (Coercible a Char) => Silly a

data Sillier a = Sillier Int {-# UNPACK #-} !(Silly a)

I noticed this gives no warning on the {-# UNPACK #-} pragma (on GHC 9.6.5). Whereas something like:

data Sillier a = Sillier Int {-# UNPACK #-} !a

will warn, saying “Ignoring unusable UNPACK pragma”.

So does this mean:

That GHC can (and/or will) unpack this sum into an unboxed sum style with a discriminator directly in Sillier? (evidenced by the lack of warning) AND
Do the Coercible constraints need to be stored in some form or are they free at runtime? It seems absurd to store them, coerce by it’s very nature does nothing at runtime, so storing the Coercible constraints as a dictionary seems a complete waste of time as every call to coerce will just be removed anyway.

Note: the reason why I don’t do this:

type role Alice representational
data Alice a where
  AliceAny :: Alice a
  AliceInt :: Alice Int
  AliceChar :: Alice Char

or this:

type role Bob representational
data Bob a where
  BobAny :: Bob a
  BobInt :: (a ~ Int) => Bob a
  BobChar :: (a ~ Char) => Bob a

Is because both of these fail to compile because then a isn’t representational. Which makes sense, because something like:

newtype MyInt = MyInt Int

x :: Bob MyInt
x = coerce BobInt

would compile but be wrong.

So I know having the Coercible constraint is not exactly the same as an exact equality constraint, but it’s close enough for my purposes.

adamgundry · January 13, 2025, 8:19am

I was curious about this too, and the easiest way I know to find out is to look at the Core using -ddump-simpl -O1 (and some -dsuppress... options), which gives this on 9.10.1:

$WSillier :: forall a. Int %1 -> Silly a %1 -> Sillier a
$WSillier
  = \ (@a) (conrep :: Int) (conrep1 :: Silly a) ->
      case case conrep1 of {
             SillyAny -> (# _| | #) (##);
             SillyInt unbx -> (# |_| #) unbx;
             SillyChar unbx -> (# | |_ #) unbx
           }
      of unbx
      { __DEFAULT ->
      Sillier conrep unbx
      }

So we see the unpacking does happen: the data con wrapper produces a Sillier constructor with two arguments, namely an Int pointer and an unboxed sum that carries either no information (in the SillyAny case) or a Coercible dictionary (in the other cases). For comparison, here’s the Core for a definition x = Sillier 0 SillyInt:

x1 :: Coercible Int Int
x1 = MkCoercible @~(<Int>_R :: Int ~R# Int)

x2 :: Int
x2 = I# 0#

x :: Sillier Int
x = Sillier x2 ((# |_| #) x1)

This doesn’t show how Coercible dictionaries are represented at runtime, and I don’t know an easy way to illustrate that, but the answer there is that they are represented using “zero-width” fields/arguments. That is, they don’t occupy any space in the data constructor, nor does a (strict) Coercible dictionary require any data to be passed in, so they are almost free at runtime. But it is important that the dictionary is evaluated, rather than being ignored entirely, because an unevaluated dictionary might turn out to be bottom and in that case the program needs to “return bottom” (i.e. loop or throw an exception) rather than potentially segfaulting due to an invalid coercion.

clinton · January 13, 2025, 11:07am

Thanks for your reply! Can you go into detail into this? What is it evaluating if there’s no pointer to the dictionary in the constructor? How could a Coercible dictionary ever be bottom? Can you give an example?

adamgundry · January 13, 2025, 4:44pm

Let me try to explain… Technically, a Coercible a b dictionary is a boxed wrapper around an underlying unboxed zero-width type a ~R b (inexpressible in source Haskell, but present in Core) which you can think of as an actual proof that the representations match. So Coercible a b might in principle be bottom, and needs to be evaluated to make sure it really contains a proof. Whereas a ~R b is unboxed, so it is already guaranteed to have been evaluated, and it doesn’t need to be represented at runtime (because it is inhabited only when the types are genuinely representationally equivalent).

Thus a data constructor can unbox a Coercible a b field and instead “store” the underlying unboxed a ~R b, which occupies no space, and the type system guarantees that the constructor wrapper will have evaluated the dictionary already.

It’s a bit hard to give a real-world example because GHC tries to avoid constructing dictionaries that loop, but the type system of Core doesn’t have a mechanism to prevent loops in general. (It does restrict how coercions, i.e. terms of type a ~R b, can be constructed to prevent loops there.)

For more detail, check out section 6 of Safe Zero-cost Coercions for Haskell, and the predecessor Equality proofs and deferred type errors. (The same mechanism is used both for coercions and for nominal equality proofs, which are used by deferred type errors.)

Topic		Replies	Views
Describing type roles as constraints Show and Tell	8	1597	November 13, 2021
GHC+DH Weekly Update #5, 2023-01-25	4	1964	February 16, 2023
[ANN] data-elevator: Turn any lifted datatype into an unlifted dataype Announcements	8	1191	November 4, 2022
GHC+DH Weekly Update #7, 2023-06-14	1	1087	June 14, 2023
Unboxed equality? Learn	6	639	July 25, 2024

Will GHC unpack GADT sum types? And are Coercible constraints free?

Related topics