Deep-transformations and more

I am please to announce four package releases today. From the bottom up:

rank2classes-1.4.1

is a minor update of the library with a couple of bugfixes and adjustments. To summarize, the library exports several classes such as Rank2.Functor or Rank2.Traversable that mirror their namesakes from the base library. Where the base class methods take a function argument of type a -> b, the rank2classes methods expect a natural transformation of type forall a. p a -> q a.

The main purpose of these classes is to support a programming technique often called Higher-Kinded Data, where every record field is wrapped into a type parameter of the record type. If the records in question are recursive and form a tree, however, it turns out that the one type parameter is not enough. Which brings me to

deep-transformations-0.1

As noted above, the methods from rank2classes operate with natural transformations which work equally forall types. In other words, they rely on parametric polymorphism. The nodes of a typical Abstract Syntax Tree, however, belong to a number of different mutually-recursive data types: module, declaration, statement, expression, etc. We often need to apply different transformations to different node types. One might say we need to work with unnatural transformations.

The new deep-transformations library can be described as rank2classes that switch from parametric to ad-hoc polymorphism. The former is represented with polymorphic functions, and the latter as type class instances.

The library comes with support for attribute grammars, among other things, as the new generalization is powerful enough to support them.

grammatical-parsers-0.5

The grammatical-parsers library leans on rank2classes to represent both grammars and parsing results in the form of Haskell records. It comes with a number of parsers with a shared interface and different performance characteristics. Some of them support Parsing Expression Grammars, others Context-Free Grammars. Some are backtracking, others are memoizing, and two of them support left-recursive grammars.

Version 0.5 comes with several major changes. A few foundational classes have been moved to the input-parsers library. There are two new parser-transformers that allow threading of a user-defined monad. See the change log for details.

language-oberon-0.3

The language-oberon package provides a library and executable for the Oberon programming language. Release 0.3 uses the new functionality from grammatical-parser and deep-transformations to implement some new features. In particular, there’s constant
folding
implemented as an attribute grammar, and the parser nows uses the left-recursive parser-tranformer to store all parsed lexemes in the AST. This enables reconstruction of the original program, with all whitespace and comments preserved.

One thing of note about this library is that it uses the finally tagless technique for constructing and re-constructing the AST. The purpose of this architecture was to enable code reuse, which leads me to

language-Modula2-0.1

This is a new package that implements a parser, re-serializer, constant folder, and pretty printer for the Modula-2 programming language. It depends on all the libraries listed above. In particular it reuses several parts of language-oberon, starting with some of the AST node declarations. Consider it a proof of (reuse) concept.

In the future I hope to push the same concept further and add more languages while reusing more code. On the other hand, I may decide to circle back and implement code generation and/or transpilation for the existing libraries. I haven’t decided yet, and I could use some input. Some help would be even better.

3 Likes

Wow, that looks impressive. I didn’t know there were other attribute grammar EDSL libraries besides AspectAG. Although, it looks like integrating attribute grammars into Haskell requires some really burdensome overhead, so I think I will keep using UUAGC as a preprocessor. How did you experience writing those language implementations with your library? When writing in UUAG I already wonder sometimes if it really makes writing (and reading) compiler implementations that much easier.

Thank you for the compliment.

UUAGC syntax is definitely lighter, just compare its MinRep implementation with mine. I have used UUAGC in production and I like it. My two complaints are that it’s not modular – I at least couldn’t find a way to combine two attribute grammars into one – and that it insists on its own node type declarations, when it’s often much more practical to reuse existing Haskell-declared data types.

One possible project I may undertake is to write a UUAGC -> deep-transformations translator. Do you think anyone would use it?

I have not tried the AspectAG library, I was frightened off by the type signatures in their examples. I believe I have improved at least that part of the experience. Don’t be fooled by the long list of constraints in the language-oberon attribute grammars: they’re there only to allow the reuse of the attribute grammar rules by other languages. A typical single-language use case would have much shorter type signatures.

Are attribute grammars worth it overall? I don’t know, that’s one of the questions I’m trying to answer with this project. The only credible alternative foundational technique I’m aware of is the F-Algebras / recursion schemata. It’s quite elegant and theoretically sound if your tree has only a single node type, even though there are some doubts if it’s all worth it. The trouble is with that single node type constraint. One solution is to merge all node types into one giant GADT. Personally I find that distasteful and limiting but Cubix is a proof that it works for some applications.

1 Like

My two complaints are that it’s not modular – I at least couldn’t find a way to combine two attribute grammars into one –

I think the idea with modularity in UUAG is not so much combining two grammars into one, but more extending one grammar without having to modify the original code. I have been told that they had written a language with extensions in different files and then they could choose features by simply including the appropriate files. Or is that what you mean by combining two grammars?

and that it insists on its own node type declarations, when it’s often much more practical to reuse existing Haskell-declared data types.

Yeah, I also think that this reduces reusability. But, I don’t think this is a fundamental limitation of UUAGC. I think it is a consequence of using UUAGC as a preprocessor. And, doesn’t your library also require special HOAS-style data types?

One possible project I may undertake is to write a UUAGC -> deep-transformations translator. Do you think anyone would use it?

I think UUAGC itself is only used by very few people, so I don’t expect that there will be many users. I also think compiling directly to optimized Haskell will produce faster programs, so personally I don’t think there is a big reason to switch.

It’s been a while and I forgot most of the details, but: while you can have one attribute grammar import (or rather include) another, the generated Haskell module will be a monolith. So you can create a chain of attribute grammar imports, but not a diamond.

Yes, it puts some constraints on the Haskell declarations it can support, but there’s still plenty of variations it will tolerate. It doesn’t insist on generating declarations in its own style.

I’m afraid you’re right. Also UUAGC doesn’t seem to be supported by the University of Utrecht any more. I can’t find it in their Web pages, and the old links are dead.

1 Like

Yeah, I get what you mean. It is only modular from the UUAG point of view, not from the Haskell point of view. Another price that you pay for the preprocessor nature of UUAGC. I wonder if language extensions of Racket can avoid these kind of issues.

EDIT: Actually, you can think about it another way: consider UUAG as a separate language that happens to compile to Haskell. The reason that it isn’t modular is because there is no UUAG equivalent of Cabal and Hackage. If such tools were to be made then you could exchange and combine different UUAG packages.

UUAGC is still used in the Helium compiler from Utrecht and it is maintained by Jeroen Bransen, a former PhD from Utrecht, he merged two of my pull requests recently (and that only took about a month). Their wiki server has been shut down but the content is still somewhere waiting for somebody to move them to another place.