GHC+DH Weekly Update #4, 2023-01-18

Hi all, you are reading the fourth weekly report on the implementation of dependent types in GHC.

This week I focused on #22478 “Modern scoped type variables: type arguments in constructor patterns”, where the goal is to make name resolution in patterns consistent across terms and types. Compare these examples:

  • f (Identity (a, b)) = (Identity a, Identity b)
  • g (Proxy @(a, b)) = (Proxy @a, Proxy @b)

In f, we have term variables a and b, while in g we have type variables a and b. But the name resolution should follow the same principle:

  • We bind variables on the LHS and use them on the RHS.

The difference between (a, b) and @(a, b) should be kept to a minimum, as far as name resolution is concerned. In the absence of punning, there should be no difference at all. This is called the Visibility Orthogonality Principle (VOP), one of the documented Design Principles for GHC.

The current implementation of type arguments in constructor patterns fails to adhere to this principle in a few edge cases.

Edge case Sig concerns kind signatures, I reported it in #556 a while back. Currently waiting for the GHC Steering Committee to accept the amendment.

Edge case Forall concerns forall-quantification in type patterns, it is something I’ve discovered just a few days ago. Consider:

f (MkT @(forall a. Maybe a)) = rhs

Does the a in Maybe a refer to the forall-bound a, or is it a binder in and of itself? If we follow the principle that variables in patterns are binders, not usages, then the above example is equivalent to:

f (MkT @(forall a1. Maybe a2)) = rhs

And that is, I must admit, quite counter-intuitive. Shall we disallow forall-types in patterns? Perhaps, but keep in mind that a forall can be hidden behind a type synonym:

type S b = forall a. Either a b
f (MkT @(S b)) = rhs

Tricky, and not really specified in the proposal. The typing rules in the paper disallow polytypes in type patterns (by using instead of ), but I wish the proposal was explicit about this. Furthermore, the current implementation is perfectly fine with polytypes, although it has trouble with higher-order kinds (see #18986).

Edge case TH concerns Template Haskell splices. Do we want to accept this?

f (Proxy @($(TH.varT (TH.mkName "t")))) = Proxy @t

Yes, I would say. It should be equivalent to:

f (Proxy @t) = Proxy @t

Alas, GHC 9.4 rejects it with the following error:

<interactive>:9:13: error:
    • Not in scope: type variable ‘t’
    • In the untyped splice: $(TH.varT (TH.mkName "t"))

In order to account for all the edge cases, I intend to reimplement name resolution logic for type patterns and make it mirror term-level patterns (rnPat) more closely. Furthermore, large parts of the new implementation could be reused in !9257 “Visible forall in types of terms”, so I would be making progress on two issues at once. In particular, I intend to reuse:

  1. rnHsTyPat, a new subroutine to rename types binding each type variable to a fresh name. The existing subroutine, rnHsTyKi, does a lookup in the environment – this is not what we need!
  2. collectPatsBinders, which I generalized to collect type variables in binding positions in addition to term variables.

A very rough draft is available at !9747. I am still in the process of thinking and prototyping, so it is likely to change significantly. However, I believe that I have finally found the proper implementation approach that can account for the aforementioned corner cases – now it’s down to technical details.

Another direction of work is visibility subsumption. There was a bug report this week (#22762) that can be fixed with visibility subsumption, so this prompted me to get back to my draft MR !9594 and to try to address the review comments. I am a bit stuck on this one. Firstly, it seems to me that Note [Equality on AppTys] is out of date, although it might be that I simply misunderstand it. Secondly, it is still not clear to me what to do when the equality is deferred to the constraint solver.

I also remembered that according to #143 Remove the * kind syntax, we ought to enable -Wstar-is-type by default in GHC 9.6, so I submitted a patch to do so:

  • Commit e9c0537c: Enable -Wstar-is-type by default

That’s it for this week. Don’t hesitate to ask any questions!


This work is funded by Serokell :purple_heart:

10 Likes

While this particular example seems like a no-brainer, what if we instead had:

f (Proxy @t) x =
  case x of
    Proxy @($(doSomeStuff)) = Proxy @t

In this context, should my code really break when an update to doSomeStuff calls the name it generates t instead of s?

I guess if we’re fully committed to not having hygiene, the answer is “yes, and doSomeStuff should really only bind gensyms”.

Edit

I’ve now had my morning tea, and I just realized that your question was really “should splices be able to bind type variables at all?”. It seems to me that the answer should be identical to “should splices be able to bind term variables?”. Sorry for the distraction.

Good question; but I think the answer is straightforward. I strongly think that the a in Maybe a is bound by the forall. Anything else would be deeply strange.

The principle should be that: the free variables of patterns are binders, not usages. Easy!

Perhaps update the proposal and/or principles to clear this up?

2 Likes

@simonpj I initially told @int-index much the same thing, but after a few rounds of Q&A I convinced myself to go much more conservative.

f (MkT @(forall a. Maybe a)) = rhs works today, but gets us into a function decidability morass as soon as the type level language gets more expressive (e.g. unsaturated type families and beyond).


Here is a thought experiment. Suppose that instead of writing

forall a. blah

we wrote

forall \a -> blah

the idea being that forall is a special data constructor forall :: (k -> Type) -> Type. Now we’ve desugared one binding construct into another.

But lets return to our pattern now; it becomes:

f (MkT @(forall \a -> Maybe a)) = rhs

Now we have a lambda in a pattern — hopefully this makes it clear why I said “function decidability morass” above.

I simply rather not get into that at this time. I rather be very cautious and only allow:

  • type constructors
  • data constructors
  • applications of the above
  • variables

in @ patterns. This means we can use the same grammar at the type and term level, and that is in keeping with the Syntactic Unification Principle.

Type synonyms and Type Families would not be accepted. (We could someday make type patterns synonyms that would be, those would preserve these rules in their bodies.)


Yes, there are all sorts of tricks we can to accept more programs, cobbling together some theory of function judgemental equality in the process. But we can always punt on that and do it later too.

At the very least, if we do accept some “fancier” types, I would want it behind a separate extension. After all, the users are clamming for a more stable GHC / easier upgrades. I would want to keep the dirt-simple cases separate, because we can be much more confident in their stability, and not have the fancier cases “infecting” that stability guarantee.

I think it is much simpler than you suggest. GHC already can perform unification on foralls and does so without problem. It’s quite different to higher order unification etc. No biggie.

This is all very simple. The pattern K @ty ... binds the free variables of ty. Easy peasy.

Unification under foralls only is easy today because the type language is not as powerful as it might become. We will run into thorny things there too, eventually.

Also the unification we do today is somewhat hidden, a regular user writing haskell may have no idea what unification is, except for error messages. By contrast, pattern matching is explicit and a very visible feature. I just want to be extremely cautious with it.

Remember also (I am forgetting which thread) you proposed that we deprecate pattern signature binds in lieu of @ patterns? I think that is great! But I wouldn’t want to tell users to leave a very old extension for something with some more uncertain edge cases.

You say it is easy, I believe it is easy today. But we’ve (myself certainly included!) been stumbling over this the exact details for months/years. I really want to not just reach for expressive power, as we are used to, but be very cautious and uphold as many principles as possible (and I think there are some additional principles on pattern matching that we should codify too).

To give an example, suppose we could do this:

f (MkT @(forall x. case x of True -> Maybe a; False -> [b])) = ...

Yes, if we don’t bind x and do bind a and b as you say, and a and b are not bound to something with x free inside them, it can work. But this is weird! I feel absolutely no urgency to support this, and the most principled way I can think to ban it is just not allowing any forall x. ....

Fundamentally function bodies (and Pi type bodies) are expression not values, and so they are a poor fit for pattern matching.

And conversely, appealing to the restriction that appear at the type level today that allow eliding this expression vs value distinction feels like going against the Syntactic Unification Principle.

Once you have lambdas and case expessions in types, everything goes out of the window. Vast numbers of existing things will stop working. This is quite orthogonal to forall, which is Vlad’s question.

The points you raise are valid ones for a possible future, but should not stop us making simple progress now.

Once you have lambdas and case expessions in types, everything goes out of the window.

OK at least we are agreed on the facts. :slight_smile:

This is quite orthogonal to forall, which is Vlad’s question.

The points you raise are valid ones for a possible future, but should not stop us making simple progress now.

Well consider how whenever a Dependent Haskell proposal it is written, care must be taken to make it not “fork-like”. We can indeed ignore this possible future now, but we are also setting ourselves up for more contortions later.

To me, it is an odd thing to wish the person working giving a “GHC+DH” weekly update (Vlad) to do a thing which will just cause more Dependent Haskell headache down the road!

@goldfire’s thesis did include some affordances to make sure existing programs did not have worse type inference in a “pay for what you use” fashion, but I don’t think those plans included a distinction in what the bodies of foralls look like. There was a matchable vs unmatchable arrow, but nothing like this distinction (about the body of the forall type itself, not that of its inhabitants).

I think the most agnostic approach we can do now is to, sure, have forall in patterns because it works in today’s language, but put it beyond a second extension. Monotype patterns are much easier and compatable with all possible futures — the truly orthogonal choice. Polytype patterns are only compatibile with some possible futures and therefore not really orthogonal choice.

Then everyone gets what they want: users uninterested in dependent futures get as much expressive power as we can give them today, users planning on using that stuff later and not wanting to write code that may become hard to support get a way to confine themselves to something more minimal and long-term stable.

Edge case TH concerns Template Haskell splices. Do we want to accept this?

f (Proxy @($(TH.varT (TH.mkName "t")))) = Proxy @t

Yes, I would say. It should be equivalent to:

f (Proxy @t) = Proxy @t

On a separate note, I think we should continue to keep the positive and negative position TH datatypes separate. The quote versions [e| ... |] vs [p| .... |] have different renaming logic, which will get more confusing if the data type being produced is in fact the same.

Either that means just adding more things to the existing pattern one so it is good for term or type level patterns (yay Syntactic Unification Principle!) or make a new “type pattern” one to complete the table of (type | term) * (expression | pattern).

If we have lambdas and case expressions in types we have to do a massive rethink. This makes it no worse. It’s like worrying about a future pinhole that will show up if there is a future hole the size of a tank. I just don’t think it’s worth our scarce cycles.

Life is short.

Fair enough. Should type families applied to free (so they would become binder) variables be allowed?

Ah now that is an excellent question.

type family F a
data T c = MkT (c -> c)
f @a (x :: T [Int]) = case x of MkT @[F b] h -> ....(True :: b)...h [1,2::Int]...

Here we bring b into scope, with the constraint [a] ~ [F b], which doesn’t tell us what b is; but it tells us a constraint on b. It will presumably also bring h into scope with type F b -> F b.

Then later we learn from (True :: b) that b ~ Bool. So the constraint checks that Int ~ F Bool.

If there wasn’t the constraint on b to tell us that b~Bool, just the call to h, we’d be stuck applying a fucntion of type [F b] -> [F b] to a [Int], and would report an error because we don’t know what b is.

Good question!

1 Like

@simonpj I think [a] ~ [F b] is a wanted constraint, at best. F is “sneaking in” as a use and not a bind by virtual of being a capital letter, yet F is not matchable like other capitalized things. (Our renaming rules are based on capitalization, but our semantics care about matchability.)

Quite arguably this means we should ban the use of non-nullary type families.

Type synonyms are only OK insofar that they are reserversable, and thus similar to pattern synnonyms as @AntC2 is fond of saying. I am not found of “exposing” that for similar future compat reasons, but if we are not doing with that then it is no worse than forall.

I see @simonpj is careful not to say “binds” b. I guess the (True :: b) counts as a usage of b, not a binding(?)

The Lexical Scoping Principle requiring a (uniquely) identifiable binding site for every tyvar seems to be on the ropes.

No, it’s all quite uniform actually – and I think this is how Vlad’s patch works. Consider:

f :: Maybe t1 -> t2
f x = case x of Just @[Tree a] y -> blah

What happens is this:

  • Start with the free variables of the @ty pattern
  • Make a fresh unification variable, alpha for each such free variable. It stands for an as-yet-unknown type.
  • Bring a into scope, bound to alpha.
  • Emit the Wanted constraint t1 ~ [Tree alpha]

Job done. Notice: no “matching” etc. Just emit a wanted constraint. After all, this might be the moment when we discover (more about) the enclosing type. E.g.

-- No type signature for g
g x = case x of Just @[Tree a] y -> blah

At the time we encounter the case-expression, we just have x :: beta, where beta is a unification variable standing for some as-yet-unknown type. By matching against Just we discover beta ~ Maybe gamma. But we still don’t know about gamma. But when we come across that @ [Tree a] we emit the wanted constraint gamma ~ [Tree alpha]. (We still don’t know what alpaha is.)

All this works completely uniformly if there are type families are involved.

I see @simonpj is careful not to say “binds” b. I guess the (True :: b) counts as a usage of b, not a binding(?)

Actually I wasn’t careful not to say that. It does bind b (to a unification variable, as always), exactly as I describe above. The LSP is in full force.

Nothing difficult here I think. But helpful to lay it out. Vlad is writing typing rules that will specify all this rigorously (thank you Vlad).

1 Like

Thanks Simon, then I completely misapprehended the whole swarm of LSP proposals – including the (later deferred) one to allow let introducing tyvars on LHS of function/instance decls.

I thought from “Lexical” they were about the binding-point for tyvars appearing in the program source, not invisible unification variables. If I infer beta ~ Maybe gamma and gamma ~ [Tree alpha] and alpha ~ a and a ~ Bool (from the RHS of the equation), I’m none the wiser as to the binding point for gamma, alpha or a. And I’m ‘getting away with it’ for beta because it appears once only.

Anyhoo don’t worry trying to explain it. I concluded from all that LSP stuff I’m too stupid to be allowed to use this DH doo-hickey – which I no longer count as Haskell. Good luck to Vlad, and perhaps if I keep reading the Updates, someday the penny will drop.

I just want to chime in to agree with @simonpj here. The variables bound in a pattern type application are the free variables of the type. They are bound to fresh unification variables. Then these variables are unified with the types that the applications correspond to, via a Wanted constraint. It’s all very easy. :slight_smile:

Re higher-order unification: yes, that is a problem, but it’s an orthogonal one to having type applications in patterns. We’re just going to do the same unification here we do everywhere else.

1 Like