What can pattern variables in type families match?

In type families, we can constraint the parameters to kinds with a limited set of types. For example, if we constraint the type parameter to Bool, then we can only apply it to True and False, right? But this type checks:

type family F (x :: Bool) where
  F True = Int
  F False = Int
  F (a b) = Int -> Int

Wait… what’s up with that last case? How is that valid? I would expect that the type pattern variable a could only match a generative type constructor, and any such constructor would not be in the Bool kind. The a cannot match a type family, because either it would be reduced, and thus not match, or it would be stuck, in which case F would also be stuck.

In function pattern matches, we can never have a pattern variable match a constructor - we can only use literal constructors (or pattern synonyms). But type families are a bit more flexible.

What kind does GHC infer for the third pattern match?

  forall k (a :: k -> Bool) (b :: k). F (a b) = Int -> Int

That’s a little abstract. Here’s a series of definitions that make this type increasingly concrete, while still satisfying the same case in the F type family:

f0 :: forall k (a :: k -> Bool) (b :: k). F (a b)
f0 = id

f1 :: forall (a :: Nat -> Bool) (b :: Nat). F (a b)
f1 = id

f2 :: forall (a :: Nat -> Bool) . F (a 1)
f2 = id

We can’t make it any more concrete. What if we did have a thing of kind Nat -> Bool? How about this type family?

type family G (x :: Nat) :: Bool where
  G 1 = False
  G 2 = True

But if we apply that, it reduces, so we match the F False case, which reduces to Int:

ghci> :k! F (G 1)
F (G 1) :: *
= Int

So it seems that there is something that our type pattern variable a can match other than a generative type constructor: a rigid type variable.

And that feels wrong to me, because there is nothing that could be filled in for that type variable! There is nothing of kind Nat -> Bool. The type family G is… but it can’t be used unsaturated. Maybe if we get unsaturated type families in GHC, we can have something of this kind.

Also I might think that subsumption would imply that the type of f2 would be a subtype of the type we’d get if we filled in a type for a. But filling in G for a makes F (G 1), which reduces to Int. And Int -> Int is not a subtype of Int. But I suppose subsumption checks only apply after we’ve already evaluated the type families. I guess what bugs me is that the result of the type evaluation seems to be dependent on the evaluation order, and the evaluation order isn’t defined.

That is, forall (a :: Nat -> Bool) . F (a 1) can be seen as a lambda at the type level (it is in dependently-typed languages), and if you first reduce F (a 1), then fill in G for a, you get a different answer than if you had first filled in G for a, and then evaluated F.

Can anyone clear up my confusion here? Should we really be allowing matching type family pattern variables against universally quantified variables?

2 Likes

Actually it is already possible to define

type family H :: Nat -> Bool

and then we get

ghci> :kind! F (H 3)
F (H 3) :: *
= Int -> Int

even though H does not have any equations (and cannot have a useful definition).

F (G 1) is not a substitution instance of F (a 1), because the unsaturated G is not a valid term. Thus it’s “fine” to have F (G 1) ~ F False ~ Int but F (a 1) ~ (Int -> Int), at least in the sense that substitutivity isn’t violated. It’s rather like the (whitespace) application operator is overloaded to have two different meanings depending on whether the head is an unsaturated type family or not. (If we had unsaturated type families, then you still could not directly substitute G for a :: Nat -> Bool, because the kind of an unsaturated occurrence of G would not be Nat -> Bool.)

This is all pretty weird. Unfortunately type families are not type-level functions, even though they look a bit like them at first glance.

4 Likes

Thanks for the informative reply, @adamgundry!

It’s rather like the (whitespace) application operator is overloaded to have two different meanings depending on whether the head is an unsaturated type family or not.

Well, we have one behavior with reducible saturated type family applications F (G 1) ~ F False ~ Int. A second behavior with stuck saturated type family applications: F (G 3) doesn’t reduce. And a third behavior with whatever you call the H type family or type variable applications: F (H 3) ~ (Int -> Int) and F (a 1) ~ (Int -> Int).

It again seems inconsistent to me that in F (H 3), H 3 is stuck but F (H 3) can be reduced to Int -> Int; and yet in F (G 3), G 3 is stuck and F (G 3) is also stuck.

F (G 1) is not a substitution instance of F (a 1) , because the unsaturated G is not a valid term. Thus it’s “fine” to have F (G 1) ~ F False ~ Int but F (a 1) ~ (Int -> Int) , at least in the sense that substitutivity isn’t violated.

OK, that makes sense to me.

If we had unsaturated type families, then you still could not directly substitute G for a :: Nat -> Bool , because the kind of an unsaturated occurrence of G would not be Nat -> Bool .

What would the kind of an unsaturated occurrence of G be, then?

This is all pretty weird. Unfortunately type families are not type-level functions, even though they look a bit like them at first glance.

I realize the operational semantics of Haskell terms is not even formally defined anywhere, so it may be asking a bit too much for documentation on the semantics of type families. But the GHC page on type famlies doesn’t explain any of the above. Nor does the wiki page on type families, nor the original paper on type families (unless maybe I missed on my glance through). I just want to get some kind of intuition for it. Is any of this documented, or do we need to just look at the GHC source?

Perhaps some different notation would make this clearer: suppose instead of f τ for both kinds of application we write f :$ τ for “matchable application” and G $ τ for “type family application”. As this notation suggests, type families can match on :$ but not on $. Thus the definition of F becomes

  F (a :$ b) = Int -> Int

so it may make slightly more sense that F (H :$ 3) reduces by matching on the :$ constructor, whereas F (G $ 3) is stuck until G $ 3 reduces.

In the unsaturated type families paper and proposal by @kcsongor et al. the arrows are annotated with M (matchable) or U (unmatchable) so G :: Nat -> @U Bool but a :: Nat -> @M Bool.

This is a good question. I think a lot of this is folklore, unfortunately. Type families arose out of a type inference tradition that was based on relations and logic (e.g. open type families came first, then closed type families were squeezed in later). Thus functional programming at the type level has always been somewhat second-class, because of the lack of a clear operational semantics. This is something of a running theme in my thesis, although re-reading it now I can’t see a single clear discussion of the issues. Perhaps it would be worth writing one…

Apart from the unsaturated type families work, a couple of other links that might be useful for further reading:

3 Likes

Perhaps some different notation would make this clearer

Yes, that makes it a lot clearer, thanks.

In the unsaturated type families paper and proposal by @kcsongor et al. the arrows are annotated with M (matchable) or U (unmatchable) so G :: Nat -> @U Bool but a :: Nat -> @M Bool .

Ah, that’s why the term “matchable” sounded familiar to me… I had read part of that paper before! Yes, that makes sense to me now.

Perhaps it would be worth writing one…

That would be great!

Apart from the unsaturated type families work, a couple of other links that might be useful for further reading

Thanks for the suggestions. Those are all interesting, and provide different insights into type families. However, I don’t think any of those (even the paper on unsaturated type families) point out exactly what is matchable, apart from type constructors. As we discussed above, the a in (a b) is matchable, as is the type family H, but neither of those examples are discussed in those sources as being matchable.