Need feedback on pattern signature behavior

@maxigit So at the end of the day this is a poll; you can have whatever opinion you like! I just want to make sure we are at least understanding each other, not that we necessarily agree :slight_smile:

Why can’t the scoping rules be the same for type and value ?

From my point of view, that is what I am trying to do: fix the rules so they are the same!

That’s the bit I really don’t get. How is it a problem in practice? In (x :: a) = x :: a, it doesn’t pose a problem to anybody to have to look at the first x to understand the second x, so why is it a problem for the as?

Let’s step back a bit.

  1. When I read the x or the a to the right of the =, I know for sure this is a variable use. My brain goes “OK, new variable, better go find out what it refers to, or remember it for when I encounter the binding later.”

  2. When I likewise encounter the x to the left of the =, I can go “OK, found where x is defined, I don’t need to keep on looking”.

  3. When I encounter the a to the left of the =, I can make no such call. Sure, I’ve found something that is of type a, but I have no idea what a is — further out there could be other usages, or something that fully constrains it, like a ~ Int. I have no choice but to keep on searching to learn more about a.
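A small made-up illustration of that bind/use asymmetry under -XScopedTypeVariables (the names headOr and swap' are hypothetical, not from the thread):

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
module BindVsUse where

-- The explicit 'forall' binds 'a', so the 'a' in the pattern
-- signature below is a *use* of that already-bound variable:
headOr :: forall a. a -> [a] -> a
headOr def (xs :: [a]) = case xs of
  []      -> def
  (x : _) -> x

-- No 'forall' has brought 'a' or 'b' into scope here, so the
-- pattern signature *binds* fresh variables 'a' and 'b'
-- (they get matched up with the signature's 'x' and 'y'):
swap' :: (x, y) -> (y, x)
swap' ((p, q) :: (a, b)) = (q, p)
```

Both pattern signatures look identical at the binding site; only the surrounding context tells you which one is a bind and which one a use.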

Again, as @jaror says, scale matters. id (x :: a) = x :: a is fine, but

f (SomeGadt :: Foo a) (x :: [a]) = 
    ... -- 30 lines!








  where
    ... -- 30 more lines!









    x :: Maybe a = ... -- pop quiz, is this `a` a bind or a use? 

is a lot harder

So that’s what I don’t like about implicit binding: reading left-to-right/outside-in, it’s alright, but reading right-to-left/inside-out, it’s a lot harder. And most of what I do is reading code that I did not write, random-accessing all over the codebase (within modules and between modules). Therefore making reading as easy as possible, even at the expense of a few keystrokes, is very important to me.

4 Likes

So do I. I am just confused whether the behavior I would like is “implicit bindings” or not possible at all?

Thanks @jaror, and @Ericson2314, I fear we could go on trading counter-examples that demonstrate our own p.o.v. 'til the cows come home. For me, this is about a style of coding (and reading more often than writing) not so much theoretical purity/Church binding. So (again) de gustibus non disputandum, I’m happy if you want to use a different style; just don’t impose your style on me please.

To demonstrate my p.o.v. I would cast the example to avoid anonymous var names more like this:

g :: X -> (X, X)
g (x :: a) = (f X, f Y) where -- error: Could not match expected type 'X' with actual type 'Y'
  f (x :: a) = X -- reuses the existing binding

h :: X -> (X, X)
h (x :: b) = (f X, f Y) where -- works
  f (x :: a) = X -- binds a new variable

Reusing the name x ‘merely’ causes shadowing; yet I find it heckofa confusing. (And I showed an example from the wild on the proposal discussion.) So a rule of thumb in my style is:

  • Don’t use same-named vars in the same neighbourhood unless they mean the same thing.
  • So this prohibits the attempt to re-use a within g.
  • By ‘neighbourhood’, I mean something a little wider than Haskell’s scope: suppose I have several equations/guarded expressions to define some function. Even though each equation/each guarded expression is a separate scope (excepting vars introduced via explicit forall in the signature), I wouldn’t blindly name each set of arguments x y z unless each had the same type and meaning across all the equations.
  • Then as with term vars, so with type vars.
  • If I were encoding with @...s I wouldn’t use tyvar a, even though it would be ‘merely’ shadowing.

IOW in neither g nor h would I name f's term vars x. Then neither would I name f's type var a.

Firstly, why are you naming a term var within a where the same as the outer function’s argument? Secondly, is there a top-level signature for f? Does it use forall a? Is -XScopedTypeVariables switched on? Thirdly, is this a the same type as the a in f's argument :: [a]?

I’d describe that code as pathological/not a style I would use/not a style I would accept from a co-worker. (Did it pass code review?) If you’re suffering code like that in an application you’re compelled to maintain, I pity you. Changing to @... style but retaining the same var and term names isn’t going to help much. I would just rewrite the whole code.

2 Likes

Just for normal Haskell users (not versed in the deep knowledge of GHC):
does this “bind vs use” thing have any theoretical foundation, or is it just a GHC implementation leak?

Until now, I have never thought in terms of binds and uses. I just put an a where needed, with the understanding that this a is the same everywhere (within the appropriate scoping rules).
Whether the first use is seen as a bind, or the whole thing is seen as a set of constraints to be solved (a bit like function types are resolved), should be irrelevant to the end user, shouldn’t it?

1 Like

I think that approach is reasonable; however, my main complaint is that it is really easy to accidentally give two variables the same name. And it is not easy for the compiler (or other tooling) to tell whether you really intended it or not.

If I now have to make a change in your code (or in fact just review your code), I don’t want to have to inspect the whole neighbourhood (which may be hundreds of lines) to figure out which type variables are already bound.

With explicit binding you can rely on automatic compiler warnings to inform you when you are shadowing a variable. (I agree that in general you should avoid shadowing and turn on those warnings.)
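For term variables this already works today; a sketch (the names outer and inner are made up) of GHC’s -Wname-shadowing flag catching an accidental reuse of a name:

```haskell
{-# OPTIONS_GHC -Wname-shadowing #-}
module Shadow where

outer :: Int -> Int
outer x = inner 3
  where
    -- GHC warns here: this binding for 'x' shadows the
    -- existing binding, so the accidental reuse is flagged.
    inner x = x + 1
```

With explicit binding of type variables, the same machinery could flag an accidentally reused type variable name in just the same way.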

Let’s say you forgot that a was already used in the function’s argument.

Since this discussion is about introducing a new extension, PatternSignatures, which is a subset of ScopedTypeVariables, I’d say we should assume ScopedTypeVariables is turned off. Then I think the other two questions are irrelevant.

Let’s say the person having to answer the pop quiz is the code reviewer. In fact, the outer scope might be hidden if it is untouched by the change that is being reviewed.

System F, which is often used as a theoretical model of Haskell, has explicit binding of type variables using the capital lambda, see System F - Wikipedia.
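Concretely, the System F identity makes the type binder explicit with a capital lambda, and a type application then eliminates it:

```
Λa. λ(x : a). x            -- a term of type ∀a. a → a
(Λa. λ(x : a). x) Int 3    -- apply the type, then the value; reduces to 3
```

There is no point in a System F term where you have to guess whether a type variable is being bound or used; the Λ always tells you.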

My familiarity with System F is probably also biasing me in this discussion, but I think there are good reasons besides just following what System F does.

1 Like

That is a weak argument. Do you also forget the variables you have already used?

Probably, as my unfamiliarity with it biases me in the other direction. Anyway, I thought we deviated from System F a while ago :slight_smile:

Above all, I just find the explicit parameter syntax ugly. It looks like an extra parameter but it is not.
I would much prefer (x :: @a) = x :: a (or in the type signature), but it seems like extra noise for nothing.

It’s definitely an extra parameter! I’ve been programming a lot with DataKinds recently, and it’s essential for my sanity that I can treat type indices as parameters just like any other parameters.

1 Like

When using or defining a function? If I write id @a (x :: a) = x :: a, I can still call id "hello" without having to do id String "hello" (or id @String "hello"), can’t I?
Moreover, is the type signature still a -> a, or is it (a :: Type) -> a -> a?

@maxigit So the original 1990s Haskell didn’t have higher-ranked types and other things, which meant that implicit bindings were always very controlled – bindings always had to go in the same extremely predictable places, which meant it really wasn’t worth being explicit.

However, since then we’ve lifted those restrictions, and have most/all of System F (depending on which “System F”). Yes, you are right we have things beyond System F, but that doesn’t weaken the argument. System F and MLTT (the O.G. “dependent types”) both have explicit bindings: with System F, Big Lambda is carefully crafted to look the same way as Little Lambda; with MLTT the two lambdas are combined into one so scoping has to work the same way. There are other things we’ve added like constraints that don’t appear in either, but they don’t change the picture; they are explicitly bound too.

The basic problem as I see it is that, rather than leaping between two “stable design points” (no bindings, unification everywhere; or explicit bindings, no implicit binding, unification more optional), we’ve crept and crawled in baby steps through many language extensions, and landed at this blurry midway point (e.g. with -XScopedTypeVariables) that isn’t really in either camp, and isn’t so coherent.

#448 (with or without my additions) is trying to incrementally get us out of this situation, but what a few of us have been thinking is that ultimately we probably want to move away from 2 ^ n language extensions, to something more coarse-grained and geared towards best representing these stable design points. E.g. you pick Haskell 98 land, where you lose expressive power but things like the order of forall-bound variables are guaranteed not to matter, or you pick a full-power but more-explicitness-required mode, where things like that do matter, but also ambiguities that might otherwise arise are made impossible. There is a lot of nuance in how to actually execute on this while being backwards compatible (it will be a long and delicate road getting there) but I think it is a good (possible) destination to keep in mind.

3 Likes

The @ is supposed to indicate “optional”, or as we call it “invisible”. Long term, it may become the case that we can have both invisible term arguments (e.g. @5, but the 5 could be inferred) and visible / mandatory type arguments (e.g. sizeOf Int, avoiding the nasty -XAllowAmbiguousTypes).

So to answer your questions:

Yes

The fully explicit type signature, which you would get from GHCi, is:

forall a. a -> a 

the forall binds the variable in the signature, and the @ binds the variable in the term; foralls and @s (once we are being explicit) always line up 1-for-1!
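A sketch of that 1-for-1 lining up, assuming a recent GHC (9.10 or later) with -XTypeAbstractions; konst is a made-up name:

```haskell
{-# LANGUAGE TypeAbstractions #-}
{-# LANGUAGE ScopedTypeVariables #-}
module LineUp where

-- Two foralls in the signature ...
konst :: forall a. forall b. a -> b -> a
-- ... line up with two @-binders in the definition:
konst @a @b (x :: a) (_ :: b) = x
```

Each forall in the signature has exactly one matching @-binder in the term, so the reader never has to hunt for where a type variable came from.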

Fair enough, but we write Haskell. System F syntax or notation is somewhat irrelevant for the non-type-theorist.

Just to make that last part explicit, the fully explicit term is something like:

(\ @(b :: Type) (x :: b) -> x :: b) :: forall (a :: Type). a -> a

I’ve added all the signatures I possibly can, and used two separate identifiers to represent the fact that a and b are syntactically unrelated (they just happen to mean the same thing, but that’s semantics).


I forgot to quote, I was responding to

I thought you were explicitly interested in the concepts / theory, and I wanted to be clear this was something that does matter for the user; it isn’t just an implementation detail.

1 Like

Ok got it.

I have one more question then. If the signature is forall a. a -> a then the previous code should be strictly equivalent to (x -> x) :: forall a. a -> a. So will both versions accept the optional parameter even though it doesn’t appear anywhere in the second version?

I am interested in the concepts / theory, but not when I am actually programming (i.e. I don’t want to have to write @ everywhere just to satisfy my theorist side).

That’s the bit I still don’t understand, but I’ll trust you on that. If it is necessary, then it is. I’m still voting for implicit parameters, and I think you understand my reasoning.

1 Like

I think you meant (\x -> x) :: forall a. a -> a? Yes, it is equivalent; one is just more “filled in”.

Not sure what you mean; do you mean \ @_ x -> x? That might be accepted in some contexts. There are some type-checking questions, unrelated to name resolution, to work out, however.

I am interested in the concepts / theory, but not when I am actually programming (i.e. I don’t want to have to write @ everywhere just to satisfy my theorist side).

OK!

That’s the bit I still don’t understand, but I’ll trust you on that. If it is necessary, then it is. I’m still voting for implicit parameters, and I think you understand my reasoning.

I just want to emphasize that optional != leaky implementation detail. There is no real upper bound to how much of your program a clever language implementation might be able to figure out / guess on its own. The point of this explicit stuff isn’t that GHC is lost and confused, but that the human might be lost and confused. Of course, this is a matter of opinion.

It is fine if you continue to like it :).

I was referring to

and

id2 = \ @a (x :: a) -> x :: a looks like it has 2 parameters. If I understand @tomjaguarpaw correctly, it is a parameter. I can call it with 2 parameters (e.g. id2 @Int 3).

But the parameter is optional, so I don’t have to provide it: id2 3 works as well.
Fair enough, but id2's signature is only forall a . a -> a, which has only one parameter. How come a function can have 2 parameters (or 1 and a half) yet its type signature shows only one?

If \ @a (x :: a) -> x :: a has type forall a. a -> a, then it is no different from id1 = \x -> x. id1 and id2 have the same type signature and the same code. Therefore I should also be able to call id1 with 2 parameters (id1 @int 3) even though id1 never mentioned any optional parameter?

What am I missing ?

forall a. is a parameter. To elaborate, the full type is forall (a :: Type). a -> a, which is why when calling id2 you first (optionally) apply a type, and then a value of that type.

Therefore I should also be able to call id1 with 2 parameters (id1 @int 3) even though id1 never mentioned any optional parameter?

You can.
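Concretely, assuming -XTypeApplications (id1 written just as in the post above):

```haskell
{-# LANGUAGE TypeApplications #-}
module VTA where

-- Defined without ever mentioning a type parameter:
id1 :: a -> a
id1 = \x -> x

-- The invisible type parameter can still be supplied explicitly:
explicit :: Int
explicit = id1 @Int 3

-- ... or left to inference:
inferred :: String
inferred = id1 "hello"
```

The @-application works because the invisible parameter comes from the type, not from how the term happened to be written.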

1 Like

Seems to me there’s a difference here between two schools of thinking. There’s an older school in which -> is a type constructor, => is syntax used to introduce a constraint, forall _. is a universal quantifier, and all of these things are distinct and unrelated ways to form types out of simpler types. The word ‘application’ is strictly used to denote a term combining something that can be subsumed by a type formed with the -> type constructor, and something that can be subsumed by the first argument to that type constructor. ‘Type applications’ are their own thing, analogous but conceptually separate. Instance resolution isn’t even spoken of in the same breath.

Then there’s a newer school, influenced if not driven by dependent type theory, in which ->, =>, forall _., and the prophesied forall _ -> are all ‘quantifiers’ with different properties like ‘visibility’ and ‘relevance’. Those properties determine whether and how one may provide a parameter when using a term of that type in order to eliminate the quantifier. For some combination of backwards-compatibility and convenience, the forall _. quantifier can be inferred if not explicitly present in a type, but regardless of whether it is inferred or explicit, it introduces a parameter (invisible, irrelevant) just like -> introduces a parameter (visible, relevant).
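To make that taxonomy concrete, here is a small made-up signature annotated with three of the four quantifiers (forall _ -> is left out, as it needs the newer RequiredTypeArguments machinery):

```haskell
{-# LANGUAGE ExplicitForAll #-}
module Quantifiers where

-- One signature, three quantifiers:
--
--            invisible,   invisible,     visible,
--            irrelevant   (constraint)   relevant
describe :: forall a.    Show a =>      a -> String
describe x = "value: " ++ show x
```

In the new-school reading, a call like describe (5 :: Int) supplies three parameters: the type Int (inferred, invisible), the Show Int dictionary (resolved, invisible), and the value 5 (visible).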

One doesn’t need to learn dependent type theory to switch gears into the new school (‘dependence’ is a property orthogonal to ‘visibility’ and ‘relevance’, AIUI), but it does represent a shift that was pretty surprising to me when I learned about it, and people educated under the old school might need some pointers when people who have adjusted to the new school use the word ‘parameter’ in its newer, more expansive sense.

Using @ to explicitly bind type parameters is definitely of a piece with the new school—all parameters are parameters, and the @ is just a visibility marker—and I support it, but I’m also not surprised that people who haven’t made the shift find it jarring.

3 Likes

@rhendric I think that’s exactly right. I want to make the more limited -XPatternSignatures better for the “new school”, and I don’t think this should adversely harm the “old school”, because they can just continue using -XScopedTypeVariables.

I think it needs to be older school [**]: -XPatternSignatures as it was before -XScopedTypeVariables came along.

-XScopedTypeVariables is a terrible extension because it makes explicit forall no longer a merely stylistic choice; its effect is exactly counter to your own rule 2: “the rules are strictly local, no outer scopes / enclosing code needs to be consulted …”. For nearly all extensions, merely switching them on does not change the meaning of an existing program: it enables/allows syntax, and that is what brings in the new meaning. (So @goldfirere makes a good point on the github thread that a warning/linter at the point of that new syntax would work as well. How’s a warning going to detect that the signature uses explicit forall and that the introduced tyvars are used in the body of the function, being careful not to warn if they’re shadowed?)

[**] Edit: Or perhaps I mean Scoped Type Variables as documented here – no mention of explicit forall.

I am only using the word parameter as a synonym for “function argument”. If parameter means something different, then indeed it is a source of confusion. Is there somewhere a clear definition of a “new school” parameter?