Deep skolemization without implicit eta-expansion

I’ve been reading about the theoretical underpinnings of deep skolemization and simplified subsumption, with Practical Type Inference for Arbitrary-Rank Types and GHC Proposal 0287 as references.

The linked section of the proposal was initially confusing to me: with the definitions

f = undefined
g f = f `seq` 0

we expect g f to diverge, and it does. However, when we supply the type annotations

f :: forall a b. a -> b -> b
f = undefined

g :: (forall p. p -> (forall q. q -> q)) -> Int
g f = f `seq` 0

g f evaluates to 0, as the proposal says. These are not the types inferred in the first example, of course, but it was surprising to me that the static types would affect how f is evaluated to WHNF at runtime. (The behavior of g (\x y -> f x y) is unsurprising; as usual, seq breaks eta-equivalence.)

@jackdk suggested on IRC that with deep skolemization, f was implicitly getting eta-expanded. I buy that - it’s probably why the proposal emphasizes manual eta-expansion - but that left me wondering whether this is actually a requirement of deep skolemization or simply an implementation detail.
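
Spelling that claim out (this is my sketch of the elaboration, not GHC’s actual output): under deep skolemization, the call is treated roughly as if I had written the eta-expanded form myself:

-- sketch: what deep skolemization implicitly turns the call into
g (\x y -> f x y)
-- seq then scrutinizes the wrapper lambda, which is already in WHNF,
-- so g f returns 0 instead of diverging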

In other words, can the definition of subsumption be enriched with deep skolemization (thereby admitting additional desired programs) purely at the type level without requiring implicit eta-expansion at the term level? The typing rules are syntax-directed, of course, but in Fig. 7, it seems that the subsumption relation is enriched via rules directed purely by type-level syntax.


Problem

I think the problem is that GHC needs to translate your source code to a form with explicit type applications. We need a helper function like this:

convert :: (forall p. p -> (forall q. q -> q)) -> (forall a b. a -> b -> b)
convert f @a @b x y = f @a x @b y

(Note that this syntax for binding type variables in function arguments is not currently supported in the source language, but it is part of Core and also part of the accepted Modern Scoped Type Variables proposal.)

This convert function does the eta-expansion needed to shuffle the arguments around. It is possible to eta-contract it once:

convert f @a @b x = f @a x @b

But that still requires one level of eta-expansion.
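
To see why that one remaining lambda matters semantically, here is the same phenomenon at ordinary rank-1 types (wrap and demo are names I made up for this sketch; no extensions needed):

-- a single wrapper lambda is already in WHNF, so seq never forces
-- what is underneath it
wrap :: (Int -> Int) -> (Int -> Int)
wrap f = \x -> f x

demo :: Int
demo = wrap undefined `seq` 0  -- 0, whereas undefined `seq` 0 diverges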

Solutions?

I think there might be ways around this problem. For example, it might be possible to use coercions instead of eta-expansion:

convert = unsafeCoerce

(This requires ImpredicativeTypes.)

This is currently unsafe in GHC, but I could imagine GHC adding safe coercion rules for higher-rank types, which would allow you to use the safe coerce instead.

Another possible solution would be to change the type system we use. I personally think MLF is a very promising type system which supports more principled type inference for higher-rank types (it gives us back principal types). In MLF, I believe you could write the type of g as:

g :: forall (b >= forall q. q -> q) (a >= forall p. p -> b). a -> Int

And the type

forall (b >= forall q. q -> q) (a >= forall p. p -> b). a

can be instantiated to both

forall a b. a -> b -> b

and

forall p. p -> forall q. q -> q

Thanks, I think this does get closer to the heart of my confusion, but I’m still not quite satisfied. Your convert function absolutely makes sense - this is essentially what SPJ seems to be referring to when he elaborates g f to System F. And I accept that actually doing the argument shuffle involved would require eta-expansion. What I’m not quite getting is why we actually have to convert.

Section 4.6.2 of the paper states that the types forall a b. a -> b -> b and forall a. a -> (forall b. b -> b) are isomorphic - indeed, convert and its inverse are the explicit isomorphism given in footnote 4. However, the issue is that Odersky/Läufer (ol) subsumption does not give us forall a b. a -> b -> b ≤ forall a. a -> (forall b. b -> b); dsk subsumption is meant to fix this.
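
Writing that isomorphism out in source Haskell, at the paper’s types (convertInverse is my name for the other direction; both directions need only RankNTypes, and both are explicit eta-expansions under simplified subsumption):

-- one direction: the convert from upthread, alpha-renamed to the paper's types
convert :: (forall a. a -> (forall b. b -> b)) -> (forall a b. a -> b -> b)
convert f = \x y -> f x y

-- the other direction; composing the two is the identity only up to eta
convertInverse :: (forall a b. a -> b -> b) -> (forall a. a -> (forall b. b -> b))
convertInverse f = \x y -> f x y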

So, okay, fine, now our two types are actually isomorphic in the dsk subsumption preorder. Let’s go back to typechecking g f. Speaking more colloquially now, it’s true that convert (well, convertInverse) would wrangle f into a form that g can accept. But why must we go this far? We already know the type of f is isomorphic to a type that g can accept. Maybe this isomorphism isn’t as strong a property as I think it is, but can we not just unify with the isomorphic type without actually converting f? Does this lead to unsoundness? Or is the obstacle simply expressing this notion in Core, for example?

Edit: I notice alternative #2 at the bottom of the proposal mentions the possibility of changing GHC’s IR to be based on System F-eta instead, which may be equivalent to what I’m asking for? But I don’t think I’m asking for everything to be done modulo eta-equivalence, just that we judiciously leverage knowledge of isomorphisms we already seem to have.


I think this is it. Core is based on System F, which has no subsumption rules at all.

System F-eta could make the eta-expansion workaround work without changing the semantics, but indeed I think it would be quite an invasive change to the language.


Great, I think that more or less clears that question up for me. Looking at it another way, I suppose GHC/Core benefit a lot from System FC being an intensional type theory, so implementing F-eta/my version of deep skolemization is essentially asking for support for an extensional type theory to be plumbed through the compiler - unsurprisingly a non-trivial task. And in retrospect, it seems quite natural that if we require intensionality, the way we model subsumption has to require rewriting terms.

(Ah! I’ve also just realized that section 5.8 of the Quick Look paper summarizes much of this conversation.)

However, now the idea of using coercions has piqued my interest. I’m catching up on new Haskell developments after a few years’ hiatus, and it seems like simplified subsumption was a bit controversial. My understanding is that while deep skolemization was convenient (or rather, that its removal painfully required adding manual eta-expansion in various places), the major downsides were type-system complexity and implicit eta-expansion changing semantics, as in this example. Well, the type-system work has already been done once, so it sounds like if no major additional type-system work is required, then a version of deep skolemization which admits the same programs but does not implicitly perform term-level eta-expansion would satisfy both users and e.g. Quick Look.

So as you suggest, could we have our cake and eat it too by continuing to base Core on System FC while abstractly working in something like F-eta or the DEEP-SKOL extension and simply inserting coercions as (asserted) proofs of type isomorphism (or other judgments) that Core cannot encode?

{-# LANGUAGE ImpredicativeTypes #-}

module Test where

import Unsafe.Coerce

type One = forall a b. a -> b -> b
type Two = forall p. p -> (forall q. q -> q)

convert :: One -> Two
convert f = \x y -> f x y

f :: One
f = undefined

g :: Two -> Int
g f = seq @Two f 0

h = g $ convert f

This works as expected, with h evaluating to 0 due to the eta-expansion. convert is basically the manual eta-expansion users had to add; the program will not compile without it.

But now instead write convert = unsafeCoerce. No eta-expansion is performed, so h now diverges, as I expected at the start of this thread. My low-level knowledge is a little fuzzy here, but my intuition is that this is safe: f and convert f may interchange a type application with a term application, but the type applications are gone in STG, so the underlying function pointers can be used interchangeably. In general, "floating out" the foralls should never change the order of term applications, right?
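
To spell out that picture (my own sketch of the Core terms and their erasure, not actual GHC output):

-- In Core, the two typings differ only in where the type lambdas sit:
--   at One:  /\a. /\b. \(x::a). \(y::b). <body>
--   at Two:  /\p. \(x::p). /\q. \(y::q). <body>
-- Type abstraction and application are erased going to STG, so both
-- collapse to \x. \y. <body>: floating a forall past a term binder
-- never reorders the term-level applications.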

So it seems that if GHC can derive dsk-subsumption judgments as before, then instead of performing implicit eta-expansion, it can use that to justify implicitly inserting a coercion, giving us the best of both worlds. Is this too good to be true?
