The Current Coercible Rules are Dangerously Misleading

While the addition of coercible is wonderful, some quirks of it are I argue dangerously misleading.

The documentation says

The function coerce allows you to safely convert between values of types that have the same representation with no run-time overhead. In the simplest case you can use it instead of a newtype constructor, to go from the newtype’s concrete type to the abstract type. But it also works in more complicated settings, e.g. converting a list of newtypes to a list of concrete types.

Coercible is a two-parameter class that has instances for types a and b if the compiler can infer that they have the same representation. This class does not have regular instances; instead they are created on-the-fly during type-checking. Trying to manually declare an instance of Coercible is an error.

And at the very bottom

For more details about this feature, please refer to Safe Coercions by Joachim Breitner, Richard A. Eisenberg, Simon Peyton Jones and Stephanie Weirich.

One might intuitively expect this to be a way for more safely doing what otherwise one might use unsafeCoerce for. And I do believe even the paper hints at this.

However, the details of the rules make this naive assumption unsound.

Suppose an author had previously been doing the following.

data Data a b = MkData (Maybe a) !b
    deriving Show

castList :: [Data a Void] -> [Void]
castList = unsafeCoerce

castMaybeList :: [Maybe (Data a Void)] -> [Maybe Void]
castMaybeList = unsafeCoerce

revCastList :: [Void] -> [Data a Void]
revCastList = unsafeCoerce

revCastMaybeList :: [Maybe Void] -> [Maybe (Data a Void)]
revCastMaybeList = unsafeCoerce

The author now reads about this new ‘Coercible’ mechanism, and thinks that it can dramatically improve their situation.
So they try to encapsolate all they were previously doing by hand with this:

unsafeCoercion :: forall a b. Coercion a b
unsafeCoercion = unsafeCoerce (Coercion :: Coercion a a)

voidCoercion :: Coercion (Data a Void) Void
voidCoercion = unsafeCoercion

Unfortunately and surprisingly, they have now made their code unsound! Coercion a b does not merely assert that the two types have the same representation. That holds true here, the only sound terms of either type are bottoms. It asserts more. Buried in the rules of the paper, for datatypes one can transform the coercion into a coercion of its type variables.

As a result, the following code goes through:

bogusCoerce :: forall a b. a -> b
bogusCoerce = (case c of Coercion -> coerce) where
    c :: Coercion a b
    c = extract mid
    extract :: forall a' b' c'. Coercion (Data a' c') (Data b' c') -> Coercion a' b'
    extract Coercion = Coercion
    mid :: Coercion (Data a Void) (Data b Void)
    mid = combine voidCoercion voidCoercion
    combine :: forall a' b' c'. Coercion a' c' -> Coercion b' c' -> Coercion a' b'
    combine Coercion Coercion = Coercion

While the poor author might have thought their use of unsafeCoerce was nicely encapsulated by their voidCoercion, they would be quite incorrect due to the fact that extract is a legal function!

Worse, the fact that Coercible is a typeclass encourages one to have things automatically solved. One might try to be clever, and create the following:

class (forall a. Coercible (Data a Void) Void) => CoercionContext where
class (forall (a :: Type). Coercible a a) => CoercionContextDummy where
instance CoercionContextDummy

data Dict (c :: Constraint) = c => Dict

coercionContext :: Dict CoercionContext
coercionContext = unsafeCoerce dummy where
    dummy :: Dict CoercionContextDummy
    dummy = Dict

Now, currently GHC does not get very clever about using Coercible constraints in the context, especially quantified ones. However, this is hardly a feature, and actually quite annoying in practice. It would hardly be surprising nor should it be undesirable for GHC to get cleverer about this!

But its limited cleverness is the only reason the following does not compile yet!

bogusCoerce :: a -> b
bogusCoerce = (case coercionContext of Dict -> coerce)

Now, obviously one would notice something as absurd as that.
But suppose one did something more like

newtype Age = MkAge Int
newtype Bar a b = MkBar (Foo b a)
newtype ZipList a = MkZipList [a]
foo :: [Foo Int (Data a Void)] -> ZipList (Bar Void Age)
foo = (case coercionContext of Dict -> coerce)

Perfectly reasonable code. Throw in a traverse with that and it wouldn’t be surprising at all.
Maybe it doesn’t compile today due to lack of cleverness in solving Coercible instances, but maybe someone put in the elbow grease to fancy it up.
And then one day in some other portion of code, Bar is changed for some reason.
Not to worry though, this code should stop compiling then! Except… not if the compiler is clever enough to instead go through the argument used in the first bogusCoerce

In conclusion: Accidental unsoundness is bad, and the current design makes it an easy trap to fall into. Worse, it isn’t that hard to move the unsoundness up into the implicit code generated by constraint solving, and so as constraint solving gets cleverer the risk of the compiler exploiting that unsoundness without any visible code showing it doing so increases.

I don’t understand this. Coercion is not a datatype in the Coercible module. It is something that this code appears to introduce and define? And it is done with unsafeCoerce. So yes, this code is unsafe. But I do not think that it reflects on the code available in the module.

The other code you have provided also makes use of unsafeCoerce. So I think the answer is: don’t do that. Using unsafeCoerce makes code unsafe. Instead, just use Coercible which the compiler automatically derives safely. And if the code cannot use that, then the compiler should not be outsmarted.

4 Likes

The poor author should not think that – that’s ultimately the resolution to this issue. The poor author should, as @sclv says, only use Coercible instances (and thus Coercions) that are derived by compiler. (There is a degree of control over which instances are derived, provided by type role annotations.)

But, you’re saying that the poor author has good reason to think that. So let’s look at the root of the issue. GHC automatically gives us extract:

extract ::
  Coercible (Data a b) (Data a' b) =>
  Dict (Coercible a a')
extract = Dict

(I’d rather use Dict than Coercion, because it’s more generic.) And the poor user wants to define

unsafeCoercion :: forall a b. Dict (Coercible a b)
unsafeCoercion = unsafeCoerce (Dict :: Dict (Coercible a a))

voidCoercion :: Dict (Coercible (Data a Void) Void)
voidCoercion = unsafeDict

You observe that unsoundness results from the coexistence of extract and voidCoercion.

I can see two problems with the documentation of Coercible. Firstly, it doesn’t explain that instances like extract exist. Secondly, it is vague about what “the same representation” means. Clearly it’s not enough for “both to always be bottom”, because that would make extract unsound.

So, I don’t think the rules are dangerous and misleading, but I do think the documentation is misleading. I would like to see the documentation tidied up. I filed an issue: #25392: Documentation for Coercible is incomplete and vague · Issues · Glasgow Haskell Compiler / GHC · GitLab.

It is defined in Data.Type.Coercion.

6 Likes

Safe modules have no obligation to consider (let alone document) their interaction with unsafe functions. The documentation of unsafeCoerce, however, could always do to offer more guidance.

1 Like

So, I would dispute that this is ‘just’ a documentation issue. I do agree it is proven that the system described in the paper is sound. However, I think the presence of extract is actually problematic from a design point of view.

Note it is possible to suppress this. You could define Data as a newtype around a Any, write DataPrim which is the actual definition, use pattern synonyms and unsafe coerce to expose the fact that Data has the invariant of actually always containing a valid DataPrim value, and then make things work safely while getting the powerful abstraction ability of voidCoercion.

But it is actually really annoying and dubious design I think that it is so difficult to use the coerce mechanism to abstract over all those cast functions defined initially.

As I understand the only way to do such is to either do the boilerplate to suppress the extract by using newtypes, role annotations, and a pile of unsafe coerces, or create an alternative typeclass representing a restricted version of Coercible a b => Coercible (f a) (f b) which is limited to basically just excluding things like Coercion but permits, well, everything else.

It is also an obnoxious abstraction leak. Outside the module it shouldn’t be so exposed if I did

data DataInner a b = MkDataInner (Maybe a) !b

newtype Data a b = WrapAny (Any)
type role Data representational representational

pattern MkData :: forall a b. Maybe a -> b -> Data a b
pattern MkData a b <- (unsafeCoerce -> MkDataInner a b) where
    MkData a b = unsafeCoerce (MkDataInner a b)

or

data Data a b = MkData (Maybe a) !b

There are some cases where such inference is justified. I think Coercible [a] [b] does logically imply Coercible a b regardless of if the inference engine infers it.

The way I’d be inclined to formalize the whole thing would be that in a dependently typed language with erased propositions we could have something like coerce :: forall a b. foreach (x : a). (exists# (y : b). SAMEREP# x y) -> b, where exists# is a strict erased total proposition and SAMEREP# would be the assertion that x and y have the same representation at runtime. Then a ~R# b would be (# foreach# (x : a), exists# (y : b). SAMEREP# x y, foreach# (x : b), exists# (y : a). SAMEREP# x y #).

Thus Coercible [a] [b] => Coercible a b could be justified by going through \x -> head (coerce [x]).

However, as soon as the data type is abstract or has strictness notations on it, things become a lot more complicated, and I think the inference engine should be conservative, regardless of if it is made clever enough to do the justified inference Coercible [a] [b] => Coercible a b.

The above mentioned principle about not being able to distinguish suggests we actually should in some cases be allowed to write our own Coercible instances or similar if Coercible [a] [b] => Coercible a b is automatically exposed, because otherwise it fails to be preserved if I replace data MyList a = MyNil | MyCons a (MyList a) with the version using unsafeCoerce.

More generally I think the handling of data types in the paper, while clever, was unprincipled and mostly unnecessary. There are actually quite a few cases where it comes out annoying compared to the elegant handling of newtypes.

For example, newtype Foo a b = MkFoo (MyTypeFamily a, b) has much nicer behavior than data Foo a b = MkFoo (MyTypeFamily a) b, and on multiple occasions I have had to basically convert my data types to that form to make them work better with the Coercible framework.

I don’t think it is a coincidence that newtypes around sums and products work so much better for Coercible and that they disable the extract rule. data Foo would have it be unsound to have Coercible (Foo Bool Bool) (Foo Int Bool) even in the case where Coercible (MyTypeFamily Bool) (MyTypeFamily Int) because of that extract rule, but by disabling that newtype Foo not only permits it to be sound but has such instances automatically appear.

The datatype rules for Coercible I claim only work ‘nicely’/‘properly’ for sums and products that use the type variables in rather specific ways. I’m not sure exactly what the rules for them are. Any direct use of the variable without a strictness annotation is fine. I think things like Foo a = Nil | Cons a (Foo (a, a)) work out fine.

But once you have strictness annotations or use of type families, the most general sound Coercible instances one can make involve jumping through convoluted hoops, if they can be represented at all. Admittedly representing some of them would require dramatic extensions. For example, Coercion (Data a c) (Data b c) -> c -> Coercion a b is sound as long as it is strict in both arguments, but actually teaching the compiler to check that let alone do it automatically in the typeclass contexts would be… well, perhaps not impossible, but it would certainly be… shall we say ‘ambitious’ to make the presence of a case on an arbitrary value of type c in the context change constraint solving like that. I suppose one could make it so that there is a constraint Inhabited# a that is automatically added to the context as result of such a case statement and then have Data a c ~R# Data b c -> Inhabited# c -> a ~R# b… but right now I just want to be able to at least soundly manually declare something which is actually “replacing unsafeCoerce with a checked version” as best as reasonable pre-dependent types (Dependent types would obviously be wanted to represent that Nothing :: Maybe Int can be safely coerced to a Maybe Bool, though perhaps some crazy singleton tricks could work)

1 Like

I find it very hard to follow what you’re saying.


After investigating, I think I understand the type family issue. There is a strange discrepancy between a type family inside a newtype and a type family inside data. In the code (hidden) below, N and D really ought to behave the same way. In summary, if you have

type family F a where
  F Int = ()
  F Bool = ()

newtype N a = MkN (F a)
data D a = MkD (F a)

then you get

-- Not allowed
-- Coercible (N a) (N a') => Coercible a a'

-- Allowed
Coercible (D a) (D a') => Coercible a a'

-- Allowed
Coercible (N Int) (N Bool)

-- Not allowed
-- Coercible (D Int) (D Bool)

-- Allowed 
Coercible (F a) (F a') => Dict (Coercible (N a) (N a'))

-- Not allowed
-- Coercible (F a) (F a') => Coercible (D a) (D a')

You’re saying the newtype behaviour is better, and I’m inclined to agree with you. It’s a great surprise to me that the data behaviour differs.

Code
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE RoleAnnotations #-}

import Data.Constraint (Dict (Dict))
import Data.Coerce (Coercible)

type family F a where
  F Int = ()
  F Bool = ()

-- These seem to be the inferred roles
type role N nominal

type role D nominal

type role NI representational

type role DI representational

newtype N a = MkN (F a)

data D a = MkD (F a)

newtype NI a = MkNI a

data DI a = MkDI a

-- Unwrap

unwrapNF :: Coercible (N a) (N a') => Dict (Coercible (F a) (F a'))
unwrapNF = Dict

unwrapDF :: Coercible (D a) (D a') => Dict (Coercible (F a) (F a'))
unwrapDF = Dict

unwrapNIF :: Coercible (NI (F a)) (NI (F a')) => Dict (Coercible (F a) (F a'))
unwrapNIF = Dict

unwrapDIF :: Coercible (DI (F a)) (DI (F a')) => Dict (Coercible (F a) (F a'))
unwrapDIF = Dict

-- Fully unwrap

-- Not allowed
-- fullyUnwrapN :: Coercible (N a) (N a') => Dict (Coercible a a')
-- fullyUnwrapN = Dict

fullyUnwrapD :: Coercible (D a) (D a') => Dict (Coercible a a')
fullyUnwrapD = Dict

-- Not allowed
-- fullyUnwrapNI :: Coercible (NI (F a)) (NI (F a')) => Dict (Coercible a a')
-- fullyUnwrapNI = Dict

-- Not allowed
-- fullyUnwrapDI :: Coercible (DI (F a)) (DI (F a')) => Dict (Coercible a a')
-- fullyUnwrapDI = Dict

-- Concrete F

f :: Dict (Coercible (F Int) (F Bool))
f = Dict

-- Concrete

concreteN :: Dict (Coercible (N Int) (N Bool))
concreteN = Dict

-- Not allowed
-- concreteD :: Dict (Coercible (D Int) (D Bool))
-- concreteD = Dict

concreteNI :: Dict (Coercible (NI (F Int)) (NI (F Bool)))
concreteNI = Dict

concreteDI :: Dict (Coercible (DI (F Int)) (DI (F Bool)))
concreteDI = Dict

-- Wrap

wrapN :: Coercible (F a) (F a') => Dict (Coercible (N a) (N a'))
wrapN = Dict

-- Not allowed
-- wrapD :: Coercible (F a) (F a') => Dict (Coercible (D a) (D a'))
-- wrapD = Dict

wrapNI :: Coercible (F a) (F b) => Dict (Coercible (NI (F a)) (NI (F b)))
wrapNI = Dict

wrapDI :: Coercible (F a) (F b) => Dict (Coercible (DI (F a)) (DI (F b)))
wrapDI = Dict


I may be misunderstanding you, but it’s not possible to exclude Coercion. Coercion is just defined in “user-space” as a GADT wrapper around Coercible. It has nothing to do with the Coercion machinery per se.


I agree, and again I’m surprised.

type role Data representational
data Data a = MkData a

type role DataAny representational
newtype DataAny a = MkDataAny Any

-- Allowed
Coercible (Data a) (Data a') => Coercible a a'

-- Not allowed
-- Coercible (DataAny a) (DataAny a') => Coercible a a'

That means I can’t change my implementation from Data to DataAny without breaking consumers!


I agree you can’t do this. I don’t understand why you want to do this, though.

After investigating, I think I understand the type family issue. There is a strange discrepancy between a type family inside a newtype and a type family inside data. In the code (hidden) below, N and D really ought to behave the same way. In summary, if you have

I made an issue for this a while back in the ghc gitlab, since I too was confused by the difference. Alas, it seems to be a pretty big change to make them work the same (aka: would require changing Core), apparently.

2 Likes

The difference between newtype and data seems to make sense to me – is it not precisely why we can coerce Int to newtype Foo = Foo Int but we cannot coerce it to data Bar = Bar Int? (i.e. a genuine representational difference).

No, it’s a quite different issue. With

type family F a where
  F Int = ()
  F Bool = ()

data D a = MkD (F a)

there’s no soundness reason you shouldn’t be able to deduce

Coercible (D Int) (D Bool)

I suspect this is GHC’s internals (specifically Core) leaking.

Isn’t this just a simple consequence of requiring type family parameters to be nominal? (Lots of discussion at ghc#8177.)

(Unless you’re saying that said design decision is the leak, but from a re-skim of the linked issue it seems like the theory of how non-nominal type family parameters would work hasn’t been fully developed yet, which seems to me to be a more fundamental issue than ‘internals leaking’ might imply.)

2 Likes

It’s a great surprise to me that the data behaviour differs.

I think it’s mostly reasonable.
The most fundamental rule for Coercible (aside from the equivalence relation ones) is that two newtypes are coercible if their constructors are visible and their underlying representations are coercible.
That’s what gives you Coercible (N Int) (N Bool) and Coercible (F a) (F a') => Coercible (N a) (N a') and it’s why you cannot get Coercible (N a) (N a') => Coercible a a' because their underlying representations being equal does not mean that a ~ a' since F is not injective.

The other set of rules applies to both newtypes and data constructors, depending on their role. Since type families can pattern match on types, F a forces both to be nominal.
This means that Coercible (D a) (D a') is only true if a ~ a' (which also holds for N although it’s not the only rule there so the other direction doesn’t work). This is also why Coercible (F a) (F a') => Coercible (D a) (D a') doesn’t hold. Data constructors don’t have any rules that depend on their internal implementation and D has type role nominal.

That means I can’t change my implementation from Data to DataAny without breaking consumers!

Technically, the second instance would be fine if GHC allowed it whenever DataAny's definition is not visible. In that case, the same reasoning it uses for data types applies and if the definition is visible, this is an extremely dangerous breaking change anyway. Although I’m not sure how that would work since you could smuggle an instance from a context where the definition is not visible to one where it is and then get an unsound Coercible a a' instance

Realistically, GHC really shouldn’t allow Coercible (Data a) (Data a') => Coercible a a'. This kind of reasoning of “there is only one possible rule that could have applied so the other direction must hold as well” is often an issue for modularity and this is no exception.

And frankly, I would be surprised if anyone actually relied on this behavior today so it’s not like it would be a massive breaking change.

Exactly. If we remove that rule I believe it is the case that we can soundly model Coercible a b as "For any term v :: a, (unsafeCoerce v) :: b is equationally equal to a safely accessible term v' :: b. GHC may not always be able to prove the Coercible a b instances, but that is understandable. This restores intuitive modularity, since people can write unsafe code to add Coercion a b terms for any types that the default handling misses without worry. Maybe one day GHC adds the ability to unsafely declare outright Coercible instances that are used automatically. Maybe one day GHC learns to recognize how uninhabited types work with strict fields. Maybe someone instead writes a plugin to add these features. Maybe one day we get dependent types and can (coerce (Nothing :: Maybe Int)) :: Maybe Bool.

But the key thing is that all of this can be done soundly and intuitively.

I’ve actually been working at writing up a sort of paper formalizing the “equationally equal to a safely accessible term”, but I’m not that great at writing papers. The system I built does however have some nice properties like avoiding eta-expansion when reordering type arguments.

It seems to me that there are two distinct concepts under discussion:

Relation A: A relation defined between two types if and only if those types have identical sets of run-time representations.
Relation B: What Coercible currently represents—namely, a relation defined between two types if they meet the somewhat complex requirements defined by the documentation.

Relation B is a strict subset of relation A, which is a point in relation A’s favor. However, relation B is easier for GHC to reason about, which is a point in relation B’s favor.

Understanding Coercible to mean relation B, Coercible (Data a) (Data a') => Coercible a a' is entirely correct and appropriate (and it is Coercible (Data a Void) Void that is incorrect and unsafe, however it is expressed). When the type parameter to Data is representational, per the rules of relation B, there is an if-and-only-if relationship between Coercible a a' and Coercible (Data a) (Data a').

It may not be appropriate to infer this if Coercible means relation A. However, relation B is useful because of GHC’s ability to infer useful conclusions about it, and changing Coercible to mean A instead of B would be, at the very least, a breaking change.

Perhaps there should be another type class for relation A instead?

Another typeclass would work. Note that strict subset however is not the problem. As the paper I’ve been trying to write describes, the problem is that there are accessible rules that don’t hold for relation A.

GHC will not automatically derive every possible Functor. We can make up for this by manually writing functors. However, if GHC exposed a function Functor f => (a -> b) -> f a -> f b which was defined for any auto-derived functor, but caused undefined behavior on any functor that couldn’t be auto-derived, this would be an iffy function I think. Even more-so if it automatically replaced calls to fmap with it sometimes.

Here we would I suppose stop importing Functor, write our own Functor, have Prim.Functor f => Functor f, and then go about only using our Functor.

I claim the current situation is more like that. And as evidence, I point this out: When would you want to use CoercibleB as a constraint? We have CoercibleB a b => CoercibleA a b as sound. We have coerce :: CoercibleA a b => a -> b as sound. Having CoercibleB seems to primarily be a random hazard.

It is fine if GHC can’t fill out every possible valid instance of CoercibleA on its own. We can write our own unsafe code, plugins, etc, same as we do in every other case where GHC can’t do everything. That is what unsafePerformIO, unsafeCoerce, and so on are for. Maybe one day GHC will be so powerful that we can write a library like bytestring without any unsafe code. Maybe one day GHC will be so powerful it generates proofs of certain halting problems encoded at the type level to generate CoercibleA instances for us.

This is however actually a bit worse than the Functor case I think. One of the big things that is great about Coercible is the Coercible a b => Coercible [a] [b]. But it is actually a lot of boilerplate to represent the rules if CoercibleA and CoercibleB are separated. (forall a b. CoercibleB a b => CoercibleB (f a) (f b)) => forall a b. CoercibleA a b => CoercibleA (f a) (f b) is unsound because we can write data types like Coercion. This would then require us to whitelist all the types that aren’t like that. Which is most types! List. Set. Maybe. List of Lists. Almost every type would have to be whitelisted!

There’s no UB with coerce right now, as far as I know. You have to use unsafe functions to write something like voidCoercion.

When I want GHC to be able to do inference like Coercible (Data a) (Data a') => Coercible a a'. It can be inferred because of the rules in relation B, but GHC can’t make that inference based on the intensional definition of relation A. It’s a correct inference, so it’s good that GHC can make it and save us some potentially unsafe code to work around it.

I don’t understand what this means; could you elaborate? Coercion is irrelevant; it’s just a Dict specialized to the Coercible case, and so its meaning depends on what Coercible means. If you have forall a b. CoercibleB a b => CoercibleB (f a) (f b), that means exactly that f has a representational parameter, which I think is enough (for a human) to conclude that CoercibleA a b => CoercibleA (f a) (f b). (Whether GHC could infer that or not would depend on how CoercibleA is operationalized.)

And in the case with the Functor you would have to write unsafe code yourself as well. instance Functor Foo would be unsafe code, which would be surprising, but if you didn’t write such things you would be safe. Similarly, though to a blessedly lesser extent, it is rather surprising that unsafely conjuring up a Coercion (Data a Void) Void. This does have the benefit of requiring something as clearly marked as unsafe as unsafeCoerce, rather than making instance Functor Foo unsafe, but both are in fact sound as long as you don’t do certain things.

This is unsound. newtype Foo a = Dict (CoercibleB Void a) has a representational parameter. However, CoercibleA a b => CoercibleA (Foo a) (Foo b) does not hold.

More generally, every use of representational parameter that isn’t ultimately dependent on the representational parameters of CoercibleB itself I believe follows your rule.

Which is almost every representational parameter! And checking if it is such is extremely mechanical, identical to the current checking except assuming CoercibleB has a stronger role.

One could manually whitelist such, or write a duplicate role checker, but why? You could start with CoercibleA and add (forall x y. CoercibleA x y => CoercibleA (f x) (f y), forall x y. CoercibleA (f x) (f y) => CoercibleA x y) => class RepB f but when would you… actually want to use that?

Hence while CoercibleB is a subset of CoercibleA, CoercibleB x y => R is not a subset of CoercibleA x y => R. Without the extract rule, however, I believe it is impossible to prove that CoercibleA and CoercibleB aren’t the same type.

This does not make it harder for GHC to reason about CoercibleA. It means the solver will be incomplete, but it already is. As demonstrated, even with CoercionContext in the context, it will not infer forall x y. CoercibleB x y despite that being implied by the rules.

There is one snag where it actually makes it potentially a little harder for GHC in core to actually manipulate the coercions without this rule, as currently I believe it is actually used in case statements. So it would take some tweaks to core.

However, one could at least remove it from the surface language for now.

There’ve been a lot of Foos thrown around, so I don’t know what point you’re making here. If you implement the instance with unsafe code, then you’ve written unsafe code. If you implement the instance with safe code, or ask GHC to derive it (and it doesn’t raise an error in the process of doing so), there are no issues that I’m aware of.

That’s not a very convincing standard for soundness. Everything is sound as long as you don’t do certain things.

No it doesn’t!

{-# LANGUAGE RoleAnnotations #-}

import Data.Coerce
import Data.Constraint
import Data.Void

type role Foo representational
newtype Foo a = MkFoo (Dict (Coercible Void a))
Main.hs:7:1: error: [GHC-29178]
    • Role mismatch on variable a:
        Annotation says representational but role nominal is required
    • while checking a role annotation for ‘Foo’
  |
7 | type role Foo representational
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

But even so, let’s write something like it that does have a representational role and see whether there’s a problem:

type role Foo' representational
data Foo' a = Coercible Void a => MkFoo'

test :: Coercible a b => Foo' a -> Foo' b
test = coerce

GHC is happy with this. And why shouldn’t it be? It’s permitting you to coerce something that captures a coercion between Void and a into something that captures a coercion between Void and b if there is a coercion from a to b. The Coercible relation is transitive, so this is all right. Everything appears to be working as intended here. Where’s the unsoundness?

These are specifically incompatible is the issue.
CoercibleA (Data a Void) Void holds. There is a value of term Foo' Void. forall a b. CoercibleB a b => CoercibleB (Foo a) (Foo b) holds. According to your claim, CoercibleA a b => CoercibleA (f a) (f b). We have CoercibleA (Data a Void) Void. So we conclude CoercibleA (Foo' (Data a Void) Void) (Foo' Void). So we can turn that Foo' Void into Foo' (Data a Void), which is unsound!

Yes. That was my point. The current coercible handling is also sound as long as you don’t do certain things. I claim the rules for soundness are counterintuitive, as indicated by my first example, and you incorrectly believing the current rules soundly permit (forall a b. CoercibleB a b => CoercibleB (f a) (f b)) => forall a b. CoercibleA a b => CoercibleA (f a) (f b).

This is a long thread and I have not yet understood it in detail. But on this topic:

Everything is sound as long as you don’t do certain things.

Two thoughts:

  • A type system is “sound” if there are no programs that (a) pass the typechecker but (b) then fail with a segfault or something at runtime.
  • The “don’t do certain things” is actually singular “don’t do a certain thing”. That certain thing is to use unsafeCoerce.

If you use unsafeCoerce then arbitrarily bad things may happen and GHC makes no guarantees.

Question: is there a claim that GHC’s type system is unsound, without using unsafeCoerce? If so, can we see a program that will crash, despite passing the typechecker?

If the only claim is that GHC’s type system is unsound if you do use unsafeCoerce, that is absolutely to be expected. unsafeCoerce is, well, unsafe. That is why it has a long and scary name :-).

3 Likes

To save you the effort looking for one, no, there is no such claim in this thread. One of the claims is that the design of Coercible makes more uses of unsafeCoerce unsound than it “should”, for some definition of “should”, which we are trying to understand in the discussion.

(Not every use of unsafeCoerce is unsound, for example let id = unsafeCoerce :: a -> a. One implicit claim in this thread is that a different design of Coercible would support more such sound uses.)

4 Likes