Hi all, it’s time for the first weekly update on the implementation of dependent types in GHC.
This week I mainly focused on #425 “Invisible binders in type declarations”. The proposal introduces @k
-binders in type
, data
, newtype
, class
, type family
, and data family
declarations:
type KindOf :: forall k. k -> Type
type KindOf @k a = k
-- ↑
-- new syntax
I dusted off an old branch where I had already modified the AST in the appropriate way. The data type in GHC responsible for representing type variable binders is HsTyVarBndr
:
data HsTyVarBndr flag pass
= UserTyVar (XUserTyVar pass) flag (LIdP pass)
| KindedTyVar (XKindedTyVar pass) flag (LIdP pass) (LHsKind pass)
In a data declaration such as data T k (a :: k) = ...
, the k
is represented as a UserTyVar
because it has no kind annotation, whereas a :: k
is represented as a KindedTyVar
.
You can see that HsTyVarBndr
is also parameterized by a flag
. In type declarations, the flag is unused, so it is instantiated to ()
. In forall
telescopes, the flag is instantiated to Specificity
and is used to distinguish between forall a.
and forall {a}.
(if this looks unfamiliar, see #99 “Explicit specificity in type variable binders”).
This flag field looks like the most natural place to store information about the visibility of a binder. First, we need a new data type to represent the choice between k
and @k
:
data HsBndrVis pass
= HsBndrRequired
-- Binder for a visible (required) variable:
-- type Dup a = (a, a)
-- ^^^
| HsBndrInvisible (LHsToken "@" pass)
-- Binder for an invisible (specified) variable:
-- type KindOf @k (a :: k) = k
-- ^^^
And now we can use it as our flag
instead of ()
. We got lucky that this type parameter was already there. Next, let us look at the parser. @
-binders are already part of the grammar, but they are rejected by a later validation pass:
ghci> data T @k = MkT
<interactive>:1:9: error:
Unexpected type application @k
In the data declaration for ‘T’
The code responsible for this looks like this:
check (HsTypeArg _ ki@(L loc _)) =
addFatalError $ mkPlainErrorMsgEnvelope (locA loc) $
PsErrUnexpectedTypeAppInDecl ki pp_what (unLoc tc)
It is not difficult to make it produce a HsTyVarBndr
instead of a parse error. This marks our first milestone: we can represent the new syntax in the AST and we can parse it from source.
Next up: Template Haskell support. At the moment, its AST is entirely separate from the one used by GHC, so we need to make the corresponding changes to it and update the conversion functions. Updating the TH AST is a bit of a headache, because if you get it wrong, you get mysterious errors when GHC tries to bootstrap itself. And since I’ve spent all my luck on the AST and the parser, this one just had to go wrong. In any case, I managed to debug it and all is well in the end.
The conversion functions in GHC.HsToCore.Quote
posed a bit of a challenge. It took me a while to disentangle the relationship between addSimpleTyVarBinds
, addHsTyVarBinds
, addQTyVarBinds
, and addTyClTyVarBinds
, and I initially couldn’t figure out the correct way to add support for HsBndrVis
to them. But after studying the code, I have come to the realization that addTyClTyVarBinds
was actually misused in two places, and those happened to be the places that were problematic to update. I corrected this in a separate refactoring:
- Commit 5d267d46:
FreshOrReuse
instead ofaddTyClTyVarBinds
The commit message explains it in more detail. With this refactoring in place, I managed to add TH support for invisible binders (mostly). There are still some issues around reification, but I decided to save them for later.
Then I decided to address the build failures I was getting in haddock
. Since I added some new information to the types (namely, the HsBndrVis
flag), Haddock did not know how to render it (i.e. convert it to HTML or LaTeX). When I began to update it, I spotted two other pre-existing issues that are not directly related to my changes but affect parts of the code that I was modifying. So I opened merge requests to fix those:
One of those is a minor refactor, the other corrects a serious bug in the LaTeX backend. I suspect that not so many people use the LaTeX backend if such an egregious error was not reported before.
Anyway, with those fixes in place, I added support for invisible binders to Haddock and moved on to the most intriguing part: type checking. When it comes to type checking left-hand sides of type declarations, there are three independent code paths involved:
- The kind inference case
- The complete user-supplied kind case (CUSK)
- The standalone kind signature case (SAKS)
Thanks to my prior experience with the implementation of StandaloneKindSignatures
, I knew what to do in the SAKS case right off. In fact, I specified it right in the proposal text in pseudo-code: see the SAKS zipping algorithm. So I went ahead and implemented the zippable
and skippable
conditions in matchUpSigWithDecl
.
The CUSK case is actually not specified in the proposal. Since CUSKs are a legacy feature, I thought I could simply avoid dealing with them by rejecting @k
-binders. Then I thought that it would be, perhaps, a bit more elegant to say that @k
-binders are not covered by the CUSK conditions, so we would fall back to the inference case. And then I realized that it might cause trouble in associated type families whose parent class has a SAKS, since we still rely on the CUSK code path to check them. Admittedly, this is something of an exotic edge case, but I would prefer to get it right, so I will need more time to think about it (and produce a compelling test case).
Finally, the kind inference case is a bit tricky, but doable. GHC is already capable of inferring forall
-quantified kinds and preserving their order relative to other binders: it does so for VDQ in kinds. If you declare data T k (a :: k) j (b :: j)
, its inferred kind will be:
T :: forall k -> k -> forall j -> j -> Type
This is nice, because it means that we have the basic infrastructure in place. If we want to handle data T @k (a :: k) @j (b :: j)
, we could use the same inference algorithm but remember that k
and j
should be quantified with an invisible forall
instead of VDQ:
T :: forall k. k -> forall j. j -> Type
However, I am yet to figure out a good place to store this information (my first instinct is to add it to TcTyVar
, but that might have unforeseen and far-reaching implications).
Some other things I did while working on #425 were:
- rebased my prototype onto
type data
TH updates - rewrote
HsTypeArg
to useLHsToken
(!9486) - identified a potential bug in
repNewtype
(narrator: it was not a bug) - opened #22560 and submitted !9480
In parallel with all of that, I continued my work on #281 “Visible forall
in types of terms”. The proposal extends the support of VDQ (forall a ->
) from kinds to types:
-- Definition:
idv :: forall a -> a -> a
idv (type a) x = x :: a
-- Usage:
n = idv (type Double) 42
This is something I began to implement a few months back (see #22326, !9257), but it was blocked on the outcome of a discussion (#558) about the best way to accommodate this change in GHC Core. And this week I’ve finally come up with an approach that might provide a reasonable way forward, which I called specificity subsumption. The gist of it is:
- in GHC Core we would not draw a distinction between inferred, specified, and required binders
- in Haskell we would have a separate check to reject programs that abuse this notion
I described this idea in a comment. The next steps would be to formalize specificity subsumption in typing rules and to sketch out an implementation in GHC.
Inspired by getting unstuck on this issue, I rebased the prototype and fixed a bug in it. The bug concerned non-linear type variable bindings:
f :: forall (a :: Type) (b :: Type) -> ()
f (type t) (type t) = ()
This program should be rejected since it has conflicting bindings for t
, but my prototype did not detect this. I fixed it by extending collectPatsBinders
with a new mode in which it would also collect type variables.
Lastly, I also dedicated some time to bug fixes and refactorings unrelated to dependent types. I opened a ticket about a long-standing issue about processing Haddock comments after parsing (#22558), described how it can be fixed in three steps, and started to implement those steps in !9473, !9476, and !9477. The approach looks promising and the first merge request of the three has already been accepted.
That’s it for this week. Let me know if you have any questions!
This work is funded by Serokell