Pre-HFTP: Coordination for finishing Trees That Grow

I am very excited for Coordination for structured error messages by goldfirere · Pull Request #24 · haskellfoundation/tech-proposals · GitHub, and I would like to do something similar for Trees That Grow.

Goals

Trees that grow (see implementing trees that grow · Wiki · Glasgow Haskell Compiler / GHC · GitLab, might need to be more up-to-date) is a project to get a single extensible Haskell AST for all our needs. The parts within GHC are mostly done, and this is what remains:

Make ghc-lib-parser a real package

ghc-lib-parser is currently extracted from ghc and contains far more of GHC than it should. We should continue trimming down the AST and parser so they can live in actually standalone package(s), without extra cruft. (ghc-lib-parser could be a single package with both, or it could be just parser and depend on an even more stripped down bare AST package.)

Crucially, GHC itself would depend on these packages, so we maintain the abstraction boundary during GHC development, as opposed to crossing our fingers that this stuff separates cleanly at release time.

Thanks to @mpickering’s work on Multiple Home Units (!6805) · Merge requests · Glasgow Haskell Compiler / GHC · GitLab, we are well on the way to having a single GHCi (or HLS) session load multiple home units, so splitting ghc in this way need not slow down GHC development.

Make template-haskell use the regular AST

Currently template-haskell has its own AST. This is bad because:

  1. It is constantly at risk of falling behind GHC and being unable to represent new features.
  2. It is generally a bit lossy/mismatched even when there aren’t new features.
  3. Keeping both in sync is just a big time sink.

We should make it use the main AST, and thus avoid these problems.

Process

I am thinking this can be “more of the same, with slight tweaks”. Basically, as 24 is refined, hopefully accepted, and then tried out, we will learn what works and what doesn’t. Once we’re “over the hump” (we’ve made significant progress converting errors, and we have our HLS proof of concept), we could start this as a very similar project to build on that new experience, try out any tweaks we wish to make, and generally keep busy.

Unlike 24, projects like HLS and hlint already use ghc-lib-parser, so the basic idea is already validated and we don’t need new prototypes. We just need to get this done and out of the 70% limbo it is currently in.

CC @alanz @int-index

One concern I have is that with error messages there are a lot of individual components that can be identified and worked on by volunteers one at a time – here, it sounds more like there are two “big bang” things that need to happen that are much less suited to incrementalization? So the “coordinator identifies issues, volunteers take up pieces” model (similar in some ways to polymath: https://polymathprojects.org/) seems like it might not fit as well here…

Yes, template-haskell is mostly a big bang, I am afraid. (Though one could imagine making template-haskell-ng side-by-side more incrementally.)

ghc-lib-parser, however, is nicely incremental. We have an existing test measuring how many modules the “root” ones depend on, and my experience seeing / working on things like

  1. Separate AST from GhcPass (#18936) (!4778) · Merge requests · Glasgow Haskell Compiler / GHC · GitLab
  2. Draft: Get rid of all `GHC.Tc` imports for the AST types (!4782) · Merge requests · Glasgow Haskell Compiler / GHC · GitLab
  3. Remove dependency from Hs.Expr to Tc.Types (#19932) (!7298) · Merge requests · Glasgow Haskell Compiler / GHC · GitLab

is that it really is more “death by 10s of cuts” than a single issue.

I’m all for this.

Parser

  • Why is it “ghc-lib-parser” not just “ghc-parser”?
  • I agree that it’s a project we can pursue incrementally… but it needs someone to keep a list of things that need doing, and push it along. Maybe a status page, listing outstanding tasks, to make it easier for contributors to know where to start?
  • Are there other tools (perhaps based on haskell-src-exts) which we could bring up to date and make usable with ghc-parser, thus giving it more value?

Template Haskell

  • Yes, a big bang, alas.
  • Quite a lot of TH users never need to look at the TH syntax tree – they just use quotes to generate it; an audit of how many packages actually mention the constructors explicitly would be quite informative.
  • One could imagine a huge pile of pattern synonyms to try to paper over the differences, but I would hate us to try to support that into the future – regard it as an immediately-deprecated transition aid.
  • The full syntax tree is, I think, somewhat more complicated than TH syntax – more data types, more nesting, more constructors. I can’t quantify that, but it’ll impose some pain on TH users.
  • Overall, I think we should consult TH users a lot before thundering ahead on this.
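
To make the pattern-synonym idea above concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `Expr`, `Ann`, `HsVar` and `HsApp` are toy stand-ins for the richer main AST (not GHC's real definitions), and `VarE`/`AppE` only borrow the names of the TH constructors, with `String` in place of `Name`.

```haskell
{-# LANGUAGE PatternSynonyms #-}

-- Hypothetical stand-in for the richer "main" AST: every node carries an
-- annotation field that the old TH-style AST never had.
data Ann = NoAnn
  deriving (Eq, Show)

data Expr
  = HsVar Ann String
  | HsApp Ann Expr Expr
  deriving (Eq, Show)

-- Explicitly bidirectional synonyms using the old TH-style names.
-- Matching ignores the annotation; construction fills in a default.
pattern VarE :: String -> Expr
pattern VarE n <- HsVar _ n where
  VarE n = HsVar NoAnn n

pattern AppE :: Expr -> Expr -> Expr
pattern AppE f x <- HsApp _ f x where
  AppE f x = HsApp NoAnn f x

-- Tell the coverage checker that these two synonyms cover Expr.
{-# COMPLETE VarE, AppE #-}

-- Old-style user code, written against the TH-like names, keeps compiling.
countApps :: Expr -> Int
countApps (VarE _)   = 0
countApps (AppE f x) = 1 + countApps f + countApps x
```

Note that matching through the synonyms silently drops the annotation and constructing through them invents one, which is the kind of mismatch that makes this a transition aid rather than something to support long term.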

One thing I’m concerned about is tool support and discoverability. Trees That Grow makes it hard to see which constructors and which values there are at a specific stage in the compilation pipeline. Could Hackage somehow support Trees That Grow in a better way? Is there a GHCi or HLS command I can use to see the possible constructors with all the type families resolved?
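
To illustrate the concern, here is a toy sketch in the Trees That Grow style. The pass names `Parsed` and `Typechecked`, the `Expr` type and the `X…` families are all made up for this example; GHC's real AST is much larger and indexes by `GhcPass`.

```haskell
{-# LANGUAGE TypeFamilies #-}

import Data.Void (Void, absurd)

-- Hypothetical pass indices.
data Parsed
data Typechecked

-- Extension points: one family per constructor field, plus one family
-- for extra constructors.
type family XLit p
type family XVar p
type family XXExpr p

data Expr p
  = Lit (XLit p) Int
  | Var (XVar p) String
  | XExpr (XXExpr p)

data Ty = IntTy | FunTy Ty Ty
  deriving (Eq, Show)

-- After parsing: no annotations and no extra constructors.
type instance XLit Parsed = ()
type instance XVar Parsed = ()
type instance XXExpr Parsed = Void

-- After typechecking: variables carry their type.
type instance XLit Typechecked = ()
type instance XVar Typechecked = Ty
type instance XXExpr Typechecked = Void

-- Nothing in the declaration of Expr says that a typechecked Var holds a
-- Ty, or that XExpr is uninhabited; you have to chase the instances.
varType :: Expr Typechecked -> Maybe Ty
varType (Var ty _) = Just ty
varType (Lit _ _)  = Nothing
varType (XExpr v)  = absurd v
```

As far as I know, the closest existing tool is GHCi's `:kind!`, which normalises a single type family application such as `XVar Typechecked`, but it works one family at a time rather than showing the whole data type with everything resolved.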

Given that there’s no particular way to incrementalize the TH component, does it make sense to be part of this proposal?

Edit: btw, would that component of the project mean I got a parser `String → Either Error Splice` effectively, which GHC would provide? Having to go through haskell-src-exts to do antiquotation is a major pain point for quasiquotes and I’d love to see it resolved… I just don’t know if the HF is in a position to make this happen…

Glad to hear it!

Oh, that is just what ghc-lib-parser: The GHC API, decoupled from GHC versions is called today. I am fine if we choose a better name.

Yes, that is exactly what the coordinator role (taken from Coordination for structured error messages by goldfirere · Pull Request #24 · haskellfoundation/tech-proposals · GitHub) is for!

Good question. I should go look at that package’s reverse deps to see what sticks out.

These make sense. I have mainly used TH in ways that do use the AST, so I certainly feel the pain!

Perhaps being able to do something like

 f [e| $f $x|] = [e| $f $x |]

would help. Note the [e|, this is not a quote of a pattern, but an expression pattern!

When I was looking at trying to make happy use TH, I did see other ways quoting and splicing could be more flexible too. I should make a separate discussion about this.

@jaror I have some doubts about the current design of TTG too, but I think doing this work will help motivate us to improve those things, as TTG needs to serve more use-cases and gets more eyeballs!

I am still hoping even if the GHC part is big-bang, converting the template-haskell library can be more incremental, but yes I am not wedded to including it. At the very least, we can demote it to just being one potential user among many.

I don’t see why not!

(Though at the same time I think our quasi-quotes are terrible and undeserving of the name, and I would rather get something like Rust’s take on Scheme’s pattern-based macros, but without the restriction on the body of the macro (so like `syntax-case`, not `syntax-rules`, in Scheme). This would mean the macro would just get the expression parsed by the host GHC, including good native errors and speed, etc., and there would be no string to reconstruct anything from.)

I meant that making TTG work better with the existing tooling could be another point on the to-do list. Or do you think that should be a separate effort after the points you list are finished?

Could you clarify, what level of skill is needed to complete these?

I’m fine either way.

I would like to switch to a more open-recursive design, but that is controversial / hard to motivate. The important thing is that finishing TTG shouldn’t be viewed as setting the AST in stone. Indeed, once stuff is more modular, it ought to be easier to make changes.

I think it’s pretty easy. Every time there is a field with a Bad Type Which Is GHC-Specific in the data structure, we replace it with a type family and define it downstream. That’s it, really.
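
As a rough sketch of that recipe (all names here are hypothetical: `Binding`, `XBinding`, `TcInfo`, and the `Ghc` and `Tooling` indices are invented for illustration, and the two halves would really live in separate packages):

```haskell
{-# LANGUAGE TypeFamilies #-}

-- "Upstream": what the standalone AST package would contain.
-- Before the refactor the field was a concrete compiler-internal type:
--   data Binding = Binding TcInfo String
-- Afterwards the AST mentions only an open type family.
type family XBinding p

data Binding p = Binding (XBinding p) String

-- Code that ignores the extension field works for every client.
bindingName :: Binding p -> String
bindingName (Binding _ n) = n

-- "Downstream": the compiler defines the instance it needs.
data Ghc

newtype TcInfo = TcInfo { inferredArity :: Int }

type instance XBinding Ghc = TcInfo

arity :: Binding Ghc -> Int
arity (Binding info _) = inferredArity info

-- Another client (say, a formatter) that needs nothing extra.
data Tooling

type instance XBinding Tooling = ()
```

The upstream half never mentions `TcInfo`, so it can be built without any of the compiler's internals, which is the whole point of the exercise.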

The GHC side of Template Haskell I am not certain about, and it’s probably trickier, but the template-haskell library I hope can be built by a GHC that doesn’t agree with it (it just wouldn’t work properly with the extension), which makes that refactor nicely self-contained, though still a slog. We can make it incremental by commenting out definitions that have yet to be rewritten, I suppose, or by just doing a bunch of pattern synonyms up front that might exist only temporarily.

Could this be the basis for a simpler alternative to Template Haskell and its use of quasiquoting?

The thesis is supported by a new Scheme-like language called Kernel, created for this work, in which each Scheme-style procedure consists of a wrapper that induces evaluation of operands, around a fexpr that acts on the resulting arguments.
This arrangement enables Kernel to use a simple direct style of selectively evaluating subexpressions, in place of most Lisps’ indirect quasiquotation style of selectively suppressing subexpression evaluation.

Fexprs as the basis of Lisp function application, John N. Shutt.

John Shutt’s work is interesting but… no, absolutely not. We don’t really need to innovate with macros at all just yet. We just need to catch up with Racket. Rust has shown how to adapt Racket things to non-sexpr languages. We just need to do what they do, and then surpass them by not requiring that “procedural macros” (aka non-trivial code to splice) be pushed to external libraries. (We currently don’t require that, so simply by not regressing we’ll do that surpassing.)

After all that catching up with the state of the art, we’ll have “earned” doing research again, and we can hopefully do https://davidchristiansen.dk/pubs/tyde2020-predictable-macros-abstract.pdf which is very good for the highly type driven stuff we tend to use TH for.