Need feedback on pattern signature behavior

Hi Haskellers! In the conversation for GHC Proposal #608 I am making some claims about what is popular / what people prefer regarding pattern signatures. It’s hard to substantiate those claims, so I am making a poll to get to the bottom of things :).

For background, GHC Proposal #448 is the master plan of an effort to make an alternative to ScopedTypeVariables that fixes a few issues. On a few occasions we’ve made new proposals to modify it, based on experience in other proposals and experience implementing it. GHC Proposal #608 is one of those proposals.

#448 reintroduces PatternSignatures (which does not currently exist as its own extension, but used to) in order to enable just that feature without the rest of ScopedTypeVariables. The question is whether that minimal extension should include the implicit binding behavior of pattern signature variables we have today (described below) or not.

(Also, an important disclaimer: we are well aware that ScopedTypeVariables is one of the most popular and widely-used extensions today. For those of you worried about breaking changes, no one is advocating that we actually get rid of ScopedTypeVariables: it is easy to keep supporting it, and old code using it, indefinitely. So don’t worry about breakage.)


OK, now for the question itself.

Today, in code like

f (_ :: x) = ...

the x inside the pattern signature can be either a use of an x bound in an outer scope, or a fresh binding of x if no such x is bound in an outer scope.
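
A minimal sketch of the two behaviors (assuming ScopedTypeVariables, which is what provides pattern signatures today; the function names are made up):

{-# LANGUAGE ScopedTypeVariables #-}

-- No outer 'x' in scope: the pattern signature freshly binds 'x'.
firstLen (xs :: [x]) = length xs

-- The forall brings 'x' into scope: the pattern signature is a use.
keep :: forall x. [x] -> [x]
keep (xs :: [x]) = xs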

Note that originally this was added because it was the only way to bind existential type variables, but now we have a new way with @, which is an explicit binding construct.
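
For example, a sketch with an existential (the @-binder in constructor patterns needs a recent GHC, and its exact extension gating has shifted between releases):

{-# LANGUAGE GADTs, ScopedTypeVariables, TypeApplications #-}

data Some where
  Some :: forall a. Show a => a -> Some

-- Old way: the pattern signature implicitly binds the existential 'e'.
showSome (Some (x :: e)) = show (x :: e)

-- New way: '@e' explicitly binds the existential.
showSome' (Some @e x) = show (x :: e)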

The question is: do people like this?

  • Don’t care, I don’t use pattern signatures so it doesn’t affect me.
  • Don’t care, I do use pattern signatures but I still don’t care.
  • No, I don’t like implicit bindings, I want to look at code and always know whether something is binding or using variables without consulting surrounding code. If I need to bind a type variable I will instead use @ and write something like
    f @x (_ :: x) = ...
    
    (the @x binds the variable, the second x uses that bound variable.)
  • Yes, I do like implicit binding. I don’t want to have to use this @... syntax to bind type variables.


2 Likes

I’m coming across this a lot these days and I’d really like explicit bindings. It would make so much sense when taken in the context of TypeApplications and type variable bindings in GADT pattern matches.

EDIT: @Ericson2314 Is there somewhere I should be commenting if I want to make the case that explicit @ type bindings are the way forward? Pretty much every day recently I want that over implicit bindings through type signatures on variables.

2 Likes

Yes: https://github.com/ghc-proposals/ghc-proposals/pull/608

Do note we’re getting @... bindings regardless. The question is: with @... existing, what should a new PatternSignatures language extension include? There are some very subtle interactions with the single namespace / “Dependent Haskell” plans, in favor of no implicit binding with PatternSignatures, which the conversation discussed; I left those out of the poll because I think it’s better to ask the crowd about current pain/pleasure than about more hypothetical situations.

Good point. I voted ‘I do like implicit binding’, which doesn’t mean I would avoid @... or would object to others using it; but sometimes @... is awkward to use, and I’d prefer implicit binding. (See an example which uses both modes in my recent post on the GitHub thread.)

It might be worth noting the swag of proposals is also reintroducing [**] result signatures, so @maxigit’s example could go

f (x :: y) :: y = x 
--         ^^ result type on LHS

At the moment in GHC, PatternSignatures does the same as -XScopedTypeVariables, so you’ll get a warning that PatternSignatures is deprecated. The chief purpose of -XScopedTypeVariables is for an explicit forall in the standalone signature to scope its type variables over the equations for the function, so in f :: forall y. y -> y; f (x :: y) = ..., the :: y would be a usage, not a fresh binding.

[**] That is, reintroducing from over a decade ago.

I think the reason it would be rejected with explicit binding is that y would simply be out of scope, not that the two ys are different. If there were a y in scope then it would be well-typed, e.g.:

g :: forall y. y -> y
g = f where
  f (x :: y) = x :: y

That should be accepted with both implicit and explicit binding.

1 Like

I mean both ys are out of scope. I don’t think it makes much sense to say they don’t refer to the same thing, because they don’t refer to anything.

If you think the only purpose of a pattern signature is to bind a new variable then yes, but I don’t agree. Pattern signatures can also just be used to specify the type of the argument using a known name. For example you could also write this:

g :: forall y. y -> y
g = f id where
   f (id' :: forall z. z -> z) x = id' x

You won’t be able to write that without a pattern signature. You could also do it with a type signature on f, but then you’d also need to explicitly state the type of x and the return type of f.
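
For comparison, a sketch of the version with a type signature on f instead; it needs ScopedTypeVariables so the outer y is in scope, and it has to spell out x’s type and the return type too:

{-# LANGUAGE RankNTypes, ScopedTypeVariables #-}

g :: forall y. y -> y
g = f id where
  -- everything must now be stated in f's signature
  f :: (forall z. z -> z) -> y -> y
  f id' x = id' x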

2 Likes

@maxigit Yes the old way of having to use another type signature to bind the variable was subpar, but that is why we are getting @. The new way to write a fully explicit id function without a standalone type signature is

id @a (x :: a) = x :: a

both as in signatures are uses, not binds; the general situation for Church-style lambda calculi has always been that when we introduce a variable, we give it a type using only already-bound variables, whether this is with \(x :: Int) -> ... or \(x :: a) -> ... (polymorphic type variables like a and ground types like Int have the same scoping rules in a simpler language).

The @a binds the a because, unlike the other as, it is a pattern, rather than merely being used in a pattern.

1 Like

You’re always going to have that inconsistency, because you can’t pattern match on types. For example:

data A = B

f (B :: A) = ...

Here B is a pattern, but A is a constructor.

I find it easier to have a mental model where all types are always constructors, rather than having some types that behave a bit like pattern matching sometimes.

1 Like

@maxigit

I understand the reason why it might be necessary, but it feels strange that the first x and the first a behave differently. Remembering why seems an unnecessary cognitive burden.

Also consider this:

id @a = id2 where
   id2 (x :: a) = (x :: a)

both as after the :: are uses in this case; the @a is above, part of a separate definition. I wouldn’t mind the :: a in patterns binding if that were always the case, but as you can see it is not. As @jaror shows, for upper-case identifiers it is always a use, and as we both showed, it is also not the case for lower-case identifiers when the variable is in scope.

The new world I am hoping to get to has this simple model (sketched in code below):

  • If it is to the right of a ::, it is always a use
  • If it is to the left of a ::, or there is no ::, it depends:
    • if it is an expression (positive position), it is always a use
    • if it is a pattern (negative position), it is always a bind

This makes for:

  1. a more rigid expression-pattern duality — it doesn’t apply to as much stuff (not to the right of the ::) but where it does apply it always applies, no funny corner cases!
  2. the rules are strictly local, no outer scopes / enclosing code needs to be consulted to understand what each bit of code means.
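
A small sketch of that model in code (assuming a recent GHC: the @-binders on the left-hand side come from TypeAbstractions, and pattern signatures today still come from ScopedTypeVariables; the function is made up):

{-# LANGUAGE ScopedTypeVariables, TypeAbstractions #-}

swap :: forall a b. (a, b) -> (b, a)
swap @a @b (x :: a, y :: b) =  -- '@a', '@b', 'x' and 'y' are patterns: binds
  (y, x) :: (b, a)             -- 'a' and 'b' sit right of '::' here: uses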
1 Like

Yes, this is an important objective for me, and it is the main reason I’m unhappy with the forall in a distant stand-alone signature scoping over the equation. (For one thing it means that signature isn’t stand-alone: it maybe doesn’t mean the same as a version without forall, but maybe does mean the same, depending on whether the forall’d tyvars show up amongst some equation.)

We already get enough complaints (including very recently) that Haskell is too purist/academic. Then do I need to read Church before I’m allowed to use Haskell? You can write perfectly good Haskell with no lambdas at all. (And “general” lambda binds terms only, not types.) So following some arcane academic theory is no objective for me.

Then my mental model is:

f (x :: a) = ...

Binds the term var x and binds the type var a at the same time at the same place.

f2 (x :: Maybe a) (y :: a) :: Either a b = ...

Binds two term vars and the type vars at the same time in the same place. The theorists can worry whether the appearances of a are bindings vs usages. And the b must also be a binding, from its position – which is why I’m keen to support result type sigs.

Decorating with @... then doesn’t avoid the need for signatures and is just clutter. (I’m not objecting if people want to clutter up their code.)

2 Likes

Yeah, all examples I’m giving are simplified. I hope it is possible to imagine real world scenarios which are not redundant. Of course you could object that those don’t exist or are not common enough, but that is a lot harder to prove or disprove.

3 Likes

In the case of a top-level binding, yes. But the rules are non-local if you consider local bindings, e.g.:

data X = X
data Y = Y

g :: X -> (X, X)
g (_ :: a) = (f X, f Y) where -- error: Could not match expected type 'X' with actual type 'Y'
  f (_ :: a) = X -- reuses the existing binding

h :: X -> (X, X)
h (_ :: b) = (f X, f Y) where -- works
  f (_ :: a) = X -- binds a new variable

Now the type of the f inside g is X -> X and the type of f in h is a -> X. So whether the type variable a is in scope or not, which may be determined by some code far away in a real world scenario, influences the type of f (_ :: a) = X.

Requiring explicit bindings avoids this spooky action at a distance.

I totally agree with this, but I feel this is a separate matter.

1 Like

Note that “the first occurrence is a bind” is not how value bindings work. Value bindings have to be linear (no two variables with the same name in the same binding group) and inner bindings shadow outer bindings if they have the same name.

Neither of those two properties hold for implicit binding of type variables (as currently implemented in ScopedTypeVariables), and in my opinion for good practical reasons as long as we don’t have explicit bindings.

However, we can get both properties back if we exclusively use explicit bindings, which will be possible with the TypeAbstractions extension in the future.
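
A quick sketch of the contrast (made-up names; assuming ScopedTypeVariables for the implicit type-variable binding):

{-# LANGUAGE ScopedTypeVariables #-}

-- Values: linearity is enforced, and shadowing is allowed (and warnable):
-- f x x = x             -- rejected: conflicting definitions of 'x'
f x = let x = 1 in x     -- inner 'x' shadows the outer 'x'

-- Implicitly bound type variables: neither property holds.
g (_ :: a) = go where
  go (_ :: a) = ()       -- not a shadow: this 'a' is a use of the outer 'a'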

2 Likes

@maxigit So at the end of the day this is a poll, you can have whatever opinion you like! I just want to make sure we are at least understanding each other, not that we necessarily agree :)

Why can’t the scoping rules be the same for types and values?

From my point of view, that is what I am trying to do: fix the rules so they are the same!

That’s the bit I really don’t get. How is it a problem in practice? In (x :: a) = x :: a, it doesn’t pose a problem to anybody to have to look at the first x to understand the second x, so why is it a problem for the as???

Let’s step back a bit.

  1. When I read the x or the a to the right of the =, I know for sure this is a variable use. My brain goes “OK, new variable; better go find out what it refers to, or remember it for when I encounter the binding later.”

  2. When I then encounter the x to the left of the =, I can likewise go “OK, found where x is defined, I don’t need to keep on looking”.

  3. When I encounter the a to the left of the =, I can make no such call. Sure, I’ve found something that is of type a, but I have no idea what a is — further out there could be other usages, or something that fully constrains it, like a ~ Int. I have no choice but to keep on searching to learn more about a.

Again, as @jaror says, scale matters. id (x :: a) = x :: a is fine, but

f (SomeGadt :: Foo a) (x :: [a]) = 
    ... -- 30 lines!








  where
    ... -- 30 more lines!









    x :: Maybe a = ... -- pop quiz, is this `a` a bind or a use? 

is a lot harder

So that’s what I don’t like about implicit binding: reading left-to-right/outside-in, it’s alright, but reading right-to-left/inside-out, it’s a lot harder. And most of what I do is reading code that I did not write, random-accessing all over the codebase (within modules and between modules). Therefore making reading as easy as possible, even at the expense of a few keystrokes, is very important to me.

4 Likes

Thanks @jaror, and @Ericson2314, I fear we could go on trading counter-examples that demonstrate our own p.o.v. 'til the cows come home. For me, this is about a style of coding (and reading more often than writing) not so much theoretical purity/Church binding. So (again) de gustibus non disputandum, I’m happy if you want to use a different style; just don’t impose your style on me please.

To demonstrate my p.o.v. I would cast the example to avoid anonymous var names more like this:

g :: X -> (X, X)
g (x :: a) = (f X, f Y) where -- error: Could not match expected type 'X' with actual type 'Y'
  f (x :: a) = X -- reuses the existing binding

h :: X -> (X, X)
h (x :: b) = (f X, f Y) where -- works
  f (x :: a) = X -- binds a new variable

Reusing the name x ‘merely’ causes shadowing; yet I find it heckofa confusing. (And I showed an example from the wild on the proposal discussion.) So a rule of thumb in my style is:

  • Don’t use same-named vars in the same neighbourhood unless they mean the same thing.
  • So this prohibits the attempt to re-use a within g.
  • By ‘neighbourhood’, I mean something a little wider than Haskell’s scope: suppose I have several equations/with guards to define some function. Even though each equation/each guarded expression is a separate scope (excepting vars introduced via explicit forall in the signature), I wouldn’t blindly name each set of arguments x y z unless each was the same type and meaning across all the equations.
  • Then as with term vars, so with type vars.
  • If I were encoding with @...s I wouldn’t use tyvar a, even though it would be ‘merely’ shadowing.

IOW in neither g nor h would I name f’s term vars x. Then neither would I name f’s type var a.

Firstly, why are you naming a term var within a where the same as the outer function’s argument? Secondly, is there a top-level signature for f? Does it use forall a? Is -XScopedTypeVariables switched on? Thirdly, is this a the same type as the a in f’s argument :: [a]?

I’d describe that code as pathological/not a style I would use/not a style I would accept from a co-worker. (Did it pass code review?) If you’re suffering code like that in an application you’re compelled to maintain, I pity you. Changing to @... style but retaining the same var and term names isn’t going to help much. I would just rewrite the whole code.

2 Likes

I think that approach is reasonable; however, my main complaint is that it is really easy to accidentally give two variables the same name, and it is not easy for the compiler (or other tooling) to tell whether you really intended it or not.

If I now have to make a change in your code (or in fact just review your code), I don’t want to have to inspect the whole neighbourhood (which may be hundreds of lines) to figure out which type variables are already bound.

With explicit binding you can rely on automatic compiler warnings to inform you when you are shadowing a variable. (I agree that in general you should avoid shadowing and turn on those warnings.)

Let’s say you forgot that a was already used in the function’s argument.

Since this discussion is about introducing a new extension PatternSignatures, which is a subset of ScopedTypeVariables, I’d say we should assume ScopedTypeVariables is turned off. Then I think the other two questions are irrelevant.

Let’s say the person having to answer the pop quiz is the code reviewer. In fact, the outer scope might be hidden if it is untouched by the change that is being reviewed.

System F, which is often used as a theoretical model of Haskell, has explicit binding of type variables using the capital lambda; see the Wikipedia article on System F.

My familiarity with System F is probably also biasing me in this discussion, but I think there are good reasons besides just following what System F does.

1 Like

It’s definitely an extra parameter! I’ve been programming a lot with DataKinds recently, and it’s essential for my sanity that I can treat type indices as parameters just like any other parameters.

1 Like

@maxigit So the original 1990s Haskell didn’t have higher-ranked types and other things, which meant that implicit bindings were always very controlled – bindings always had to go in the same extremely predictable places, which meant it really wasn’t worth being explicit.

However, since then we’ve lifted those restrictions, and have most/all of System F (depending on which “System F”). Yes, you are right we have things beyond System F, but that doesn’t weaken the argument. System F and MLTT (the O.G. “dependent types”) both have explicit bindings: with System F, Big Lambda is carefully crafted to look the same way as Little Lambda; with MLTT the two lambdas are combined into one so scoping has to work the same way. There are other things we’ve added like constraints that don’t appear in either, but they don’t change the picture; they are explicitly bound too.

The basic problem, as I see it, is that rather than leaping between two “stable design points” (no bindings, unification everywhere) and (explicit bindings, no implicit binding, unification more optional), we’ve crept and crawled in baby steps with many language extensions, and landed in this blurry midway point (e.g. with -XScopedTypeVariables) that isn’t really in either camp, and isn’t so coherent.

#448 (with or without my additions) is trying to incrementally get us out of this situation, but what a few of us have been thinking is that ultimately we probably want to move away from 2 ^ n language extensions, to something more coarse-grained and geared towards best representing these stable design points. E.g. you have to pick Haskell 98 land, where you lose expressive power but things like the order of forall-bound variables are guaranteed not to matter; or you pick the full-power-but-more-explicitness-required mode, where things like that do matter, but ambiguities that might otherwise arise are made impossible. There is a lot of nuance in how to actually execute on this while being backwards compatible (it will be a long and delicate road getting there), but I think it is a good (possible) destination to keep in mind.

3 Likes

The @ is supposed to indicate “optional”, or as we call it “invisible”. Long term, it may become the case that we can have both invisible term arguments (e.g. @5, but the 5 could be inferred) and visible / mandatory type arguments (e.g. sizeOf Int, avoiding the nasty -XAllowAmbiguousTypes).
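
For the visible half, a sketch of what that could look like with the forall a -> syntax from the RequiredTypeArguments proposal (sizeOfVia is a hypothetical name, not an existing function):

{-# LANGUAGE RequiredTypeArguments #-}

import Foreign.Storable (Storable, sizeOf)

-- 'forall a ->' makes the type argument visible and mandatory, so the call
-- is unambiguous even though 'a' otherwise appears only in a constraint.
sizeOfVia :: forall a -> Storable a => Int
sizeOfVia a = sizeOf (undefined :: a)

n :: Int
n = sizeOfVia Int   -- the type is passed like an ordinary argument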

So to answer your questions:

Yes

The fully explicit type signature, which you would get from GHCi, is:

forall a. a -> a 

the forall binds the variable in the signature, and the @ binds the variable in the term; foralls and @s (once we are being explicit) always line up 1-for-1!

Just to make that last part explicit, the fully explicit term is something like:

(\ @(b :: Type) (x :: b) -> x :: b) :: forall (a :: Type). a -> a

I’ve added all the signatures I possibly can, and used two separate identifiers to represent the fact that the a and b are syntactically unrelated (they just happen to mean the same thing, but that’s semantics).


I forgot to quote, I was responding to

I thought you were explicitly interested in the concepts / theory, and I wanted to be clear this is something that does matter for the user; it isn’t just an implementation detail.

1 Like