Lean: unnecessary complexity/verbosity [was ref Dependent Haskell]

(continuing the length example from that thread …)

Which shifts the dial from entirely unnecessary verbosity to merely annoying verbosity. @atravers’ reply reduces the verbosity further, but really my worry with Dependent Haskell is the beginner’s experience/on-ramp for learning about types. In Haskell as of the early ’90s (and Miranda likewise) you could (and you still can) go

myLength []       = 0
myLength (_ : ys) = 1 + myLength ys

And they can use myLength to get sensible answers. I’m presuming Lean doesn’t allow omitting the sig entirely, which takes us backwards more than 30 years. Did I mention Hindley-Milner?

The compiler infers (some alpha-equiv of) myLength :: Num b => List a -> b.

So the learner asks: what’s that Num b => business? Learning opportunity 1: use an annotation to get more precise types:

myLength []       = 0 :: Int

Magic! All the results from myLength are now Int, due to the power of type unification. Learning opportunity 2 (and a good discipline to get into):

myLength :: List a -> Int         -- essentially atravers' form

I didn’t need some special announcement to introduce the term-level ys in the equation above; I just used it in a pattern. Then why should I need some special announcement for a tyvar? Just use it in a type. Why do I have to say a : Type? Just assume kind Type for beginners; by-and-by they’ll find out about other kinds.
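For comparison, my best guess at the Lean 4 spelling (a sketch: if I understand right, Lean’s auto-bound implicits can sometimes let you drop the {α : Type} binder, but the signature itself can’t be omitted):

def myLength {α : Type} : List α → Nat
  | []      => 0
  | _ :: ys => 1 + myLength ys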

So this is a language design principle, which Haskell somewhat observes: default unmarked code to what a beginner probably wants; report type errors where that doesn’t work out; make it easy to add simpler and then incrementally more complex annotations/declarations where somebody (no longer a beginner) wants more exotic stuff.

Do not impose all that bondage-and-discipline on beginners. They already have more than enough to get their heads round.

Now my understanding is that DH won’t mess with that principle: the exotic stuff is explicitly opt-in with named extensions.

But my experience with GHC over the past dozen or so years is that even though I haven’t opted-in, the error reporting/suggestions assume I have.

GHC’s current error reporting suggests a blanket “To defer the ambiguity check to use sites, enable AllowAmbiguousTypes”. Naive (and not-so-naive) users tend to grab for this suggestion, which a) will only move the ambiguity to the usage site, where it is more difficult to diagnose; and b) is liable to let through other ambiguous signatures that were not intended.

Nobody at GHC HQ gets rewarded for improving the learner’s experience; nor for tidying up developments that have got beyond the experimental phase but are still held together with barbed wire (OverlappingInstances/FunctionalDependencies) and gaffer tape (TypeSynonyms), with splinters sticking out (DatatypeContexts).

(This might be the same point made another way round.)

From these docos about constructing proofs within the type system:

… a function cons which inserts a new element at the head of a list. What type should cons have? Such a function is polymorphic: we expect the cons function for ℕ, bool, or an arbitrary type α to behave the same way. So it makes sense to take the type to be the first argument to cons, …

For “an arbitrary type” – that is, for any type – cons is not just polymorphic but parametrically polymorphic, in Haskell terms. Then no: it makes no sense for the type to be any argument at all to cons. I want to say not that the term-level arguments to cons “depend on” that type (which is the verbiage used earlier in that section), but rather that the arguments “exhibit” or “instantiate” or “unify to” that type.
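The def under discussion is roughly this (reconstructed from the docs, so treat as a sketch):

-- the element type is an explicit first argument at every call site
def cons (α : Type) (a : α) (as : List α) : List α :=
  List.cons a as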

It seems that just because you sometimes want to manipulate ‘inside’ the structure of types for Dependent Typing, Lean users are always and everywhere paying the overhead of dragging extra arguments around in the surface syntax.

It’s an important feature of Haskell to distinguish ‘parametric’ vs ‘ad-hoc polymorphic’ (aka overloaded). Again it eases learners into the type system that the data structures they first encounter are parametric: Lists, Maybe, Either, …
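The distinction is expressible in Lean too, just spelled differently (a sketch, names mine: an implicit binder gives the parametric kind; instance brackets give the ad-hoc kind):

-- parametric: one uniform behaviour for every α
def twice {α : Type} (f : α → α) (x : α) : α := f (f x)

-- ad hoc (overloaded): behaviour supplied per type via an instance
def double {α : Type} [Add α] (x : α) : α := x + x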

Yeah, how it’s written there is misleading; they also seem to think it’s bad, since the same section in the new version has been completely rewritten.

Hmm? Thanks for pointing out there’s a Lean4 version. It seems to be exactly the same wording: “So it makes sense to take the type to be the first argument to cons, …”. There is a revised def for cons, with a (IMO superfluous) first argument giving the element type.

The later text also seems to be much the same as Lean3:

This is a central feature of dependent type theory: terms carry a lot of information, and often some of that information can be inferred from the context. In Lean, one uses an underscore, _, to specify that the system should fill in the information automatically. This is known as an “implicit argument.”

It is still tedious, however, to type all these underscores. When a function takes an argument that can generally be inferred from context, Lean allows you to specify that this argument should, by default, be left implicit. This is done by putting the arguments in curly braces, as follows:
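The example elided from that quote is presumably something along these lines (a sketch with a hypothetical ident, not a verbatim quote):

-- α in curly braces is implicit: Lean infers it at each use site
def ident {α : Type} (x : α) : α := x

#check ident 42        -- ident 42 : Nat   (α inferred as Nat)
#check ident "hello"   -- ident "hello" : String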

So I guess we end up in just about the same place as Haskell today, with explicit foralls and type @pplication. I’m not seeing leaner nor cleaner.

I’m struggling to find in the Lean docos a clear example of a function returning a Type (as opposed to returning a value of some dependent type). Down in the weeds of an exercise for type-safe access to a database there’s:

structure NDBType where
  underlying : DBType
  nullable : Bool

abbrev NDBType.asType (t : NDBType) : Type :=
  if t.nullable then
    Option t.underlying.asType
  else
    t.underlying.asType

So the : Type for asType means it returns a type. But that’s an abbrev, not a first-class function:

Definitions written using abbrev are always unfolded.

This sounds more like a Haskell type synonym, although asType has some conditional logic, and what it computes is a type, not a term. Note underlying : DBType, so this is a different namespace (introduced earlier in the exercise [**]; sketched below, after the list); t.underlying.asType is therefore ref’ing a different asType, defined in that namespace with type DBType -> Type. Then:

  • In t.underlying, the . is field access;
  • in t.underlying.asType, the second . is applying function asType, to return a Type.
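For reference, the DBType half of the exercise looks roughly like this (reconstructed from Functional Programming in Lean, so treat as a sketch):

inductive DBType where
  | int | string | bool

abbrev DBType.asType (t : DBType) : Type :=
  match t with
  | .int => Int
  | .string => String
  | .bool => Bool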

Then here’s an example usage from earlier, declared using an ordinary def, but then it returns Bool, not Type:

def DBType.beq (t : DBType) (x y : t.asType) : Bool :=
  match t with
  | .int => x == y      -- t refined to .int:    x y : Int,    so BEq Int
  | .string => x == y   -- t refined to .string: x y : String, so BEq String
  | .bool => x == y     -- t refined to .bool:   x y : Bool,   so BEq Bool

Boolean equality for fields of types available in the database. The scrutinee of the match is t; the => then refines the types of x, y, yielding type-safe despatch to the appropriate overloading for (==). Since the RHSs of those three => are identical and (==) is already overloaded, it’s not abundantly clear to me why the compiler needs so much bondage-and-discipline hand-holding.

[**] Or is it a different data structure? It’s declared using keyword inductive, not structure. And I’m not getting why using the constructors from that inductive needs a dot prefix (.int); there’s no dot in the decl.
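Digging a little: the leading dot seems to be Lean resolving the short name against the expected type, so .int elaborates to DBType.int wherever a DBType is expected. A minimal sketch, building on the inductive above:

-- the expected type is DBType, so `.int` resolves to DBType.int
example : DBType := .int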

abbrev is like def, but always unfolded during type-checking. You may still bind higher-order functions this way and apply them as well.

The more general pattern where we define a function mapping an ADT into Type is called the universe design pattern: The Universe Design Pattern - Functional Programming in Lean

Edit: note that use of == would require (the equivalent of) an Eq instance. Unless asType is an abbrev, you could not solve this instance in each match branch: there is no BEq instance for DBType.int.asType as such; only for the underlying Int (or whatever).
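Building on the DBType sketch above, that unfolding can be checked directly (a sketch):

-- asType is an abbrev, so DBType.int.asType unfolds definitionally to Int,
-- and instance search for (==) can reuse BEq Int
example : DBType.int.asType = Int := rfl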
