RoleAnnotations is a Band-Aid on a poor decision

Forgive me if this is already common knowledge or has already been thoroughly discussed, I am not very proficient in Haskell and I am new to discourse.haskell.

One go-to motivating example for the RoleAnnotations extension is a Set a type with a private constructor where we’d like to restrict a to be nominal in order to prevent type-safe coercions that could nonetheless break internal invariants of Set. Coming from the Rust world, my first thought was that a should be nominal by default if Set has private constructors. I was told by a friend that “Haskell doesn’t have a Rust-like notion of privacy” and therefor this restriction was “icky”, which I accepted until I learned that this same logic applied to newtypes:

(from Data.Coerce’s documentation)

instance Coercible T b => Coercible NT b
This instance is only usable if the constructor MkNT is in scope.

Surely, a similar rule could theoretically be applied to type constructor coercions, something along the lines of:

… let D be a prototypical type constructor (data or newtype) with three type arguments, which have roles nominal, representational resp. phantom. Then there is an instance of the form
instance Coercible b b’ => Coercible (D a b c) (D a b’ c’)

  • (!!) This instance is only usable if all constructors of D are in scope or if D has an explicit role annotation.

Of course, I can hardly blame the developers of GHC for not accounting for this, and implementing this restriction as the default would constitute a major breaking change.

My broad questions are:

  1. Is this a fair assessment of role inference? Can this suggestion even be feasibly implemented? What dark corners am I neglecting to consider?
  2. What, if anything, can be done about this?
2 Likes

A quick search using:

haskell "role annotation" safe coercions

listed this article:

…does it help to answer those questions?

2 Likes

Section 6.4: “The role of role inference” and the linked mailing list is very helpful, thank you! Reading through it and will post thoughts later.

Reading that mail thread, it seems that the main argument against switching to nominal by default was that it would be hard to support the three latest GHC versions at that time.

But now GHC2024 will even include role annotations by default, so perhaps then we can also move towards making them required for data types that don’t export their constructors.

I think this could solve major annoyance with Safe Haskell, namely that coercions and some deriving strategies are considered unsafe.

Edit: I have changed my mind after reading more of the discussion and thinking it through more.

The main argument against nominal by default is that nominal parameters are by far in the minority. So almost everyone would have to write manual annotations in many places.

I was thinking GHC could assign nominal roles if the constructors of a data type are not exported. However, many libraries do expose the constructors in a .Internal module. So just looking at the immediate module is not always enough. And looking through multiple levels of reexports is not feasible.

Furthermore, it is also not feasible to determine roles at use-sites. At use-sites we do know which constructors are in scope, but the type we are interested in might contain other types for which we would also need to check the roles. Doing that recursively for all types could be a significant compile-time cost.

1 Like

Heh heh, that paper also got linked on this current thread. Yes, section 6.4 and the ghc-devs thread shows there were really gnarly decisions about defaulting roles.

Edit: (in response to @atravers raised eyebrow) I mean the Ghost of Departed Proofs paper at the link, section 1.6 relies on roles/coercions.

Something I wonder about nominal roles; take the type role Map nominal representational example in section 3.1. We have

data Map k v = Leaf | Node k v (Map k v) (Map k v)

We don’t want to make k representational (that is, we want to forbid coerce on anything used as the first arg to a Map type), because

It would be disastrous if the user were allowed to coerce from (Map Age v) to (Map Int v), because a valid tree with regard to the ordering of Age might be bogus when using the ordering of Int. Functions that manipulate Maps use an Ord k constraint …

Well the Haskell of approx a million years ago (the Haskell 1998 standard) used to support telling the compiler about that:

data Ord k => Map k v = Leaf | Node k v (Map k v) (Map k v)
--   ^^^^^^^^

Section 3.2 last para says

the parameters of a type class are assigned a nominal role.

So our role inference would say: any type args appearing in datatype contexts must be nominal – at least by default/a role decl could override that (Section 3.2 again). Furthermore the Ord k constraint is useful documentation – even if we don’t try to implement any of the behaviour that used to follow from the context under H98. (For the youngsters out there, datatype contexts are/were quite different to GADTs – which carry the constraint inside each constructor.)

Yeah, I had the same idea during the discussion at Pre-proposal: Undeprecate DatatypeContexts · Issue #611 · ghc-proposals/ghc-proposals · GitHub.

Also, datatype contexts have the advantage that GHC could infer nominal roles for them without needing explicit RoleAnnotations.
[source]

But I don’t see a way to make the rest of datatype contexts work (e.g. make it possible to define a Foldable instance for data Ord a => Set a) without major changes like the partial type constructors work proposes.

Yeah, I wasn’t trying to re-litigate that discussion. That’s why I said not to try and implement any behaviour. I think datatype contexts as no more than structured/compiler-checkable documentation would be more helpful than a role annotation – which gives no explanation why k is nominal [**]. Furthermore since we’re already looking at the data decl to determine the phantoms, everything’s neatly in one place.

[**] And it’s a pain having to declare the role of all the type args, when for most of them, the default is fine.

The key reason we need role annotations is to prevent unsafety. Its a sort of “nice bonus” that this same mechanism can also be used to encode certain other sorts of structure hiding as well.

In general, when we design an opaque structure in haskell, we have to go out of our way to not export constructors but only “smart constructors”, etc. In such a situation, adding a “nominal” role is one more element of this process. So I think the current default of representational is much more natural.

Yet again, language features are being used or added in an attempt to solve an implementation problem - GHC’s inability to elide redundant calls to and matches on newtype data constructors in Core:

So why can’t they be removed just before the code generation process, when presumably all Haskell types are erased from the code being compiled?

The language feature you are discussing is coercion not annotations. But in general, coercion can occur reaching “very deeply” inside a datastructure – there’s no necessary infelicity in implementation here. Optimization is opportunistic and expecting an optimizer to recognize an arbitrarily complex no-op is unrealistic, not just due to typed core but in general.

Reading the mailing list discussion has given me a lot more appreciation for the true thorniness of this situation. It’s no longer clear to me whether GHC developers made the “wrong” decision to always infer the most liberal type-safe role, and it was of course incorrect for me to imply that they were not aware of this issue.

One thing that sticks out to me looking at the timeline is that GeneralisedNewtypeDeriving was discovered to be unsound before its official release in 6.8.1, as #1496 seemingly predates it. Why was the team so keen on breaking as little GeneralisedNewtypeDeriving code as possible (instead deciding to push the breaking change onto individual library authors, to my eyes), considering it was known to be unsound, and was an extension at that? This is evidently a gap in my understanding of the Haskell ecosystem.

The version history in the current GHC manual doesn’t seem to tell the full story. There are references to generalized newtype deriving going back to at least 5.04: Generalised derived instances for newtypes. Edit: Ah the release notes of 5.04 also lists it as being introduced in that version:

  • It is now possible to derive arbitrary classes for newtypes. See Section 7.9.

Maybe it is just the extension flag, not the behavior, that was introduced in 6.8.1?

1 Like

By “just before the code generation process” I meant that GHC’s optimiser (its simplifier) had run to completion, and the resulting code was ready for translation out of Core into the next (more imperative) intermediate code - STG, JSTG, et al. So at this point coerce calls which are part of a larger “chain of applications” e.g:

... (snaffle (coerce (obsplain (...

surely can then be removed:

... (call_snaffle (call_obsplain (...

Right. But even if you could always do that, that wouldn’t suffice to eliminate all uses of coerce and all need for it. That’s just a single missed optimization possibility.

But if the majority of them could be eliminated, the few obscure ones that are missed could be treated as errors - programmers can then revise their rejected code accordingly (ideally this should be no more onerous than having to manually specify role annotations). Doing this also means those few obscure ones will decrease as future advances in code-examination techniques can be used to eliminate more of them.