Isn’t it easier and more efficient to write interpreter of pure AST instead of compiling source to combinators? Are there any practical reasons?
I guess this is a good question for @augustss
I asked this at ZuriHac last year. From what I remember the answer was that it would probably not be much harder and might even be more efficient, but he chose combinators for old times’ sake (the earliest compilers for lazy functional languages used combinators, notably David Turner’s Miranda) and because it is also a proven simple approach.
There’s also a big question of whether you write the interpreter in C or in Haskell. The former will be harder, but makes bootstrapping easy and can perhaps get you more performance.
It’s correct that the use of combinators is mostly a historical accident.
But that said, I think it’s easier to translate lambda calculus to combinators and then have combinator interpreter. Combinators are very simple and easy to implement. The great advantage is that there are no variables, so there is no complicated handling of environments. You also get full laziness for free.
So unless I’m going to generate native code, I would use combinators again. I think it’s easier and as efficient. But I’m happy to be proven wrong. MicroHs is there for anyone to experiment with.
Turner’s original paper on the topic, A New Implementation Technique for Applicative Languages, is a delightful read. Very approachable, and easy to get started with. And to stray a little off topic, I’d also recommend the Lambda Days memorial session for David Turner (feat. Lennart giving a good demo of translating lambdas to combinators, running the evaluation, and some MicroHS-on-microcontroller).
Don’t combinators lead to exponential blow up of term size in some cases?
Not exponential but quadratic if you do it well. But yeah, that is still a disadvantage.
Oleg Kiselyov wrote a paper showing how to do it linearly by using bulk combinators but those are not technically a finite set of combinators and he compares to De Bruijn indices using unary numbers which is also a bit sketchy. And I don’t think MicroHs uses this approach.
Theoretically there are bad cases, but they don’t happen in practice. The bad case is deeply nested lambdas, but that’s not common.
Just the optimization you get from using B and C is enough to make sizes reasonable.
I tried using Oleg’s combinators, but it was slower. To be fair, I didn’t put very much optimization effort into it.
I just cloned the repo, but the >7000 line eval.c file is pretty intimidating. That said, I’d think it should only require adding a variable node and a lambda node, so maybe it is not that much work?
The combinator reduction code fits on a screen. The other 7000 lines are primitives you need no matter what.
If you want to experiment, you can get the lambda expressions from the compiler, before the bracket abstraction has run. You only need to implement lambda calculus + int arithmetic to run simple programs.