Is there a ready-made way to convert between Template Haskell and GHC's internal syntax?

I am looking for a nice way to define values of Haskell syntax in Haskell.

It seems it would be ideal to have an inlay of the Template Haskell syntax into the internal Haskell syntax of GHC.

Has anyone defined such an inlay already?

One possible definition would be to print a value of the Template Haskell syntax and then parse it with GHC. But parsing stuff with GHC is hard. It would also introduce spurious partiality: we know that the Template Haskell syntax is a subset of the internal Haskell syntax of GHC, so it will always be parsed successfully, but this knowledge is lost when we print and then parse.

4 Likes

If you haven’t already seen this, you might find this project interesting. It uses GHC’s parser to generate internal syntax, and then turn that into TH syntax. Maybe it can be adapted to do what you’d like?

1 Like

If all you want to do is run the parser, there is now a function called parser.

-- | A pure interface to the module parser.
--
parser :: String         -- ^ Haskell module source text (full Unicode is supported)
       -> DynFlags       -- ^ the flags
       -> FilePath       -- ^ the filename (for source locations)
       -> (WarningMessages, Either ErrorMessages (Located (HsModule GhcPs)))

If you want to parse expressions instead of whole modules, you will need to copy and tweak parser.

If you want to typecheck as well, I suggest you have a look at https://discourse.haskell.org/t/code-snippet-how-to-compile-a-string-to-an-optimised-coreexpr-using-the-ghc-api/8490/3; you can tweak this (probably by copying the definition of compileToCoreSimplified, sadly) to get a type-checked expr.

I’m quite doubtful that TH will ever get the ability to introduce new imports, because that would mean we can’t do module dependency analysis separately from parsing and type-checking the whole module.

Perhaps you simply want a templating engine for pretty-printing Haskell code? What’s so bad about using an established pretty-printing library then?

3 Likes

What I want to do overall is find a great solution for generating large quantities of Haskell code from some kind of specification, say OpenAPI. So I am looking at various options. As far as I know, the current state of the art is to build most of the code with Template Haskell, print it, and then glue everything together, adding things like imports and exports as hard-coded strings. This can be made to work, but it is not what I would consider a great solution.

The obvious type to pretty-print is the internal Haskell syntax of GHC. If I recall correctly, ormolu switched from haskell-src-exts to it, so they must think it is good, and I trust their judgement. However, there is the question of constructing values. This means all kinds of values, from simple expressions up to whole packages. The obvious way to construct simple expressions is the fancy brackets of Template Haskell. GHC will make sure for me that these expressions are healthy, and ⟦x + 1⟧ is much easier to read than something like InfixE (Just (UnboundVarE x)) (VarE GHC.Num.+) (Just (LitE (IntegerL 1))).
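To make the readability comparison concrete, here is a minimal sketch that builds the same small expression twice: once with a quotation bracket and once from constructors. It assumes only the template-haskell package; the names quoted, manual, and render are made up for illustration. A lambda is used so that x is bound inside the quotation.

```haskell
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH

-- The expression written with a quotation bracket.
-- GHC checks the syntax of the bracket when this module is compiled.
quoted :: Q Exp
quoted = [| \x -> x + 1 |]

-- Roughly the same expression assembled from constructors by hand.
manual :: Exp
manual =
  LamE [VarP x] (InfixE (Just (VarE x)) (VarE '(+)) (Just (LitE (IntegerL 1))))
  where
    x = mkName "x"

-- Run a quotation in IO and pretty-print the resulting syntax tree.
-- This works for simple quotations; quotations that rely on
-- compiler-only features such as reify would fail when run in IO.
render :: Q Exp -> IO String
render q = pprint <$> runQ q
```

Calling render quoted and pprint manual should give similar text, except that the quoted version carries a freshly generated name for x. In both cases the operator is printed fully qualified.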

So where do I get DynFlags? It was a problem some years ago. There is no indication of whether, or how, this problem was solved.

Plausibly I can take an expression from Template Haskell and make it up into a trivial definition. A standalone definition is a valid module, so it should get parsed.
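A rough sketch of that idea, staying on the Template Haskell side: wrap the expression in a trivial definition and print it under a hard-coded module header. The helper names asDefinition, renderModule, and example are hypothetical, and the import list is an assumption: which module the printed names are qualified with depends on the GHC version, so the header may need adjusting.

```haskell
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH

-- Turn an expression into the trivial definition `name = expr`.
asDefinition :: String -> Exp -> Dec
asDefinition name body = ValD (VarP (mkName name)) (NormalB body) []

-- Glue a hard-coded header and the pretty-printed definitions into module text.
renderModule :: String -> [String] -> [Dec] -> String
renderModule modName imports decs =
  unlines $
    ("module " ++ modName ++ " where")
      : map ("import " ++) imports
      ++ [""]
      ++ map pprint decs

-- Build one definition from a quotation and render it as a module.
example :: IO String
example = do
  body <- runQ [| \x -> x + 1 |]
  pure (renderModule "Generated" ["GHC.Num"] [asDefinition "increment" body])
```

The resulting text is what would be handed to a parser. Whether it parses and compiles still depends on the imports matching the qualified names that pprint emits.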

I have not looked at printing closely yet because I expect it to be rather easy. What seems hard to me is giving the working human an easy way to define values of Haskell syntax. However, I am eager to hear your suggestions on printing as well. What libraries do you have in mind?

This is kind of how I imagine it to work, but in the wrong direction. We can see the scope — it takes about 600 lines of code to pattern match on all the possible constructors and put them into other constructors. This is exactly the work I hope to dodge.

1 Like

In that ghc-meta project I also included some very simple code to parse strings using GHC:

That doesn’t require the full dynflags.

1 Like

What is the appeal of using Template Haskell for this? TH is about generating Haskell code and then compiling it. You only seem to use it as an ergonomic quasiquoter, [| x + 1 |]. Which is fair enough, given how easily available TH is. That makes it a simpler WYSIWYG solution than integrating with a proper templating library, especially given that GHC does not (yet) do multiline strings.

Although I do wonder whether you actually do get what you see. Does pretty-printing TH preserve source locations? In other words, do your templates round-trip? I have no idea. I would not expect them to.

As far as imports are concerned, I don’t think writing ppr [| import Foo |] is all that much better than "import Foo". So the “incompleteness” of TH does not seem all that limiting to me.

Either way, using the TH AST or the GHC AST for this purpose seems… daring. I would rather use interpolation for this purpose. I see no reason why you would need to parse Haskell in your use case.
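For comparison, a «string with holes» generator can be sketched with nothing but list functions from base. This does not use the interpolation package's quasiquoter; it is a plain-string stand-in for the same idea. The Field type and renderRecordModule are hypothetical names, not part of any library.

```haskell
-- A hypothetical description of one record field to generate.
data Field = Field { fieldName :: String, fieldType :: String }

-- Render a module containing a single record type.
-- The header, the imports and the declaration are all plain strings.
renderRecordModule :: String -> [String] -> String -> [Field] -> String
renderRecordModule modName imports typeName fields =
  unlines $
    ["module " ++ modName ++ " where", ""]
      ++ map ("import " ++) imports
      ++ [""]
      ++ recordLines
  where
    constructorLine = "data " ++ typeName ++ " = " ++ typeName
    recordLines
      | null fields = [constructorLine]
      | otherwise =
          constructorLine
            : zipWith renderField ("{" : repeat ",") fields
            ++ ["  }"]
    -- The first field opens the braces, later fields are separated by commas.
    renderField sep (Field n t) = "  " ++ sep ++ " " ++ n ++ " :: " ++ t
```

Nothing here checks that the output is valid Haskell, so a generator like this would have to be validated by compiling what it produces.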

GHC will make sure for me that these expressions are healthy

I think this point is hugely oversold. You can easily test whether you are generating syntactically correct Haskell by compiling your outputs.

If you really want, you could fork interpolation to run parse over the interpolated string. I would not spend time on that unless it is mission critical that your software always produces parsable code and you can’t exhaustively test the code paths that generate the code.
Given that you expect TH to detect bugs for you, you can just aim for 100% test coverage of code (rather than paths) and be good. Do note that parsable doesn’t imply compilable either, unless you use Typed Template Haskell (even more churn). Ultimately what will hold you back is what kind of code you generate, not whether it is well-formed.

1 Like

Thank you Sebastian. This is good advice. I have not made my mind up yet as to whether I am better off using templates like those of interpolation or like those of Template Haskell.

There are three reasons I was thinking of Template Haskell:

  • I already have some working (though terribly complicated) Template Haskell code.
  • I saw some examples of people using Template Haskell to generate code at the scale I aim at, this for example. This stuff seems to be more robust than the relatively fragile «string with holes» solutions.
  • Lists (and, in particular, strings) lend themselves well to «regular» problems, but I am not sure if my problems will be mostly «regular». If I get a «context free» problem, it would be better to have my data in the form of a tree.

interpolation and similar «string with holes» kind of solutions are also something I keep in mind.

Those are good points in favor of TH. I still don’t think that you need to wrangle with GHC’s parser etc. just to reliably encode a string such that it can be put into a Haskell string literal. That’s just lexical syntax, and I bet you could always just call the corresponding escaping function in GHC directly (or go through TH just for that specific case).
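As a small illustration of the escaping point: the Show instance for String in base already produces a Haskell string literal, surrounding quotes and escapes included. The sketch below uses show as a stand-in for whatever escaping function GHC uses internally; stringLiteral and stringConstant are hypothetical helper names.

```haskell
-- `show` on a String yields a Haskell string literal,
-- with surrounding quotes and all necessary escapes.
stringLiteral :: String -> String
stringLiteral = show

-- Emit a definition whose right-hand side is an arbitrary, safely escaped string.
stringConstant :: String -> String -> String
stringConstant name value =
  unlines
    [ name ++ " :: String"
    , name ++ " = " ++ stringLiteral value
    ]
```

Since read inverts show for strings, read (stringLiteral s) == s can serve as a quick round-trip check in a test suite.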

I don’t really buy the “context free” problem, though, because it will be much harder to build a TH expression directly, without quotes (otherwise, why not use interpolation for the quote?).

I think you could just ask the authors of that OpenAPI package directly why they chose TH and whether they would do so again.

1 Like