GRIN compiler funding

I’m not the author of the GRIN compiler. I noticed Asterius going dead, so the GRIN compiler is what’s left that’s looking to do Wasm. Currently the author seems to be experimenting with profiling and debugging tools. The author has a Patreon page.

1 Like

Please let’s not spread misinformation. Asterius is anything but dead.

5 Likes

Somewhat kicking. Their roadmap is outdated and possibly unfulfilled. Half the reason I’m proposing this is that GRIN could probably be applied to PureScript, which has more libraries.

Could be good to link to concrete documentation on these claims so that third parties have something to refer to.


The roadmap for Asterius is outdated on both the docs website and GitHub, and the last commit on the repo is from about half a year ago. The GRIN compiler does things besides Wasm, which supposedly now includes profiling. As for serverless, it looks to be about reducing bundle size, and the same goes for the front end.

I’m seriously confused by the negative tone towards Asterius here. Did you maybe try to contact the developer? Or are you judging purely by what you could glance at in a minute?

In any case I’m not opposed to a call for funding Grin if the HF has funds to spare. I think Csaba is doing an amazing job exploring different approaches.

4 Likes

Regarding Asterius: https://twitter.com/ShpadoinkleUI/status/1423357881479581698

2 Likes

+1 this, for what it’s worth.

I’ve always been curious what prominent Haskellers and GHC hackers think about GRIN. It seems like a set of good ideas with a lot of promise (although presently at the research level), but it doesn’t seem to get any attention in the community. Are there catches I’m missing?

I personally feel the ideal GHC pipeline would be:

Core --> a strongly typed version of GRIN --> Rust MIR, with types matched as closely as possible, and the runtime written in Rust.

My internship supervisor at Utrecht University once said to me that the concept behind GRIN, whole program optimisation, is unworkable in practice. That was, however, a retelling of the experience of one of the PhD students working on UHC in his department, so I really want to ask him again which PhD student it was, so that I can ask them about their experience directly. It is unfortunate that negative results disappear like this in academia. I would like to know why GRIN has fallen out of use: why are all the Haskell compilers that use it (UHC, JHC) currently unmaintained? Why did GHC win in the end? Is it a kind of Linux vs Hurd story, where GHC, like Linux, is the quick and dirty solution and GRIN, like Hurd, is the nice theoretical solution which takes much more time to develop?

1 Like

That’s an interesting perspective.

It is unfortunate that negative results disappear like this in academia.

:100:

My personal theoretical/idealist perspective is that whole-program optimisation (WPO/LTO) is necessary to improve the performance of Haskell and similar languages beyond a certain plateau. I look at high-level, non-imperative languages as recipes for creating programs rather than directly executable models.

The empirical perspective on the difficulty is very valuable, bearing in mind it’s a single data point.

Nevertheless, Csaba’s project is very valuable and worth supporting: it tries different things and works out what works, as well as what doesn’t.

I wonder why s/he said that? It may be because GRIN requires whole-program compilation (WPC), and WPC is hard to pull off – you may be compiling thousands of modules simultaneously.

But not impossible. Csaba Hruska is working on precisely that, I believe with GRIN in mind. (See several blog posts on that page.)

Csaba, do you have an update?

Simon

I’m well aware of Csaba Hruska’s work; I just think that GRIN has already been tried at least twice (UHC & JHC) and both compilers are now mostly abandoned (of course not necessarily for reasons related to GRIN). I also found these performance numbers from 2010: https://web.archive.org/web/20160616121712/http://mirror.seize.it/report.html. They seem to indicate that even in 2010 GHC was usually faster than its competition; only JHC outperformed GHC in a few runtime benchmarks, but its compile times were much worse (about 20x on average). So, I’m not very optimistic about GRIN.

Edit: I had greatly overestimated the age of both UHC and JHC. It turns out UHC was only really released in 2009 and JHC in 2005, so that may explain these performance numbers: GHC had a 13-17 year head start. It is a shame that now, in 2021, these projects have already started to fall out of use.

But of course everybody can choose for themselves what to spend their time on, and I love that people are willing to spend theirs on improving the performance of functional programs. I’m mostly just looking for an answer to the question of why GRIN hasn’t taken off yet, partly because I might want to contribute to similar projects in the future and I’m trying to figure out which research avenues are worth my time.

@simonpj maybe you are actually able to give some insight into the converse question: how have you been able to make GHC so popular? Was it just the first compiler that was usable enough for some people? Or were you more open to accepting external contributors? I feel like UHC and Helium are very difficult to contribute to if you don’t know somebody “on the inside”. I also hear GHC has very good documentation; maybe that played a role?

If you haven’t read the “being lazy with class” paper it may help fill in some history for you: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/history.pdf

In my own opinion, at a certain point it just became clear that GHC was the dominant compiler. There were few enough Haskellers, and the network effects were such that people increasingly wrote packages supporting it alone (especially because it ran ahead first on various “practical” features, making it the least purely researchy), so the process became self-reinforcing. At a certain point, no other project could muster the sustained resources to be a competitive industrial-strength compiler, and so they just developed based on a few individuals’ interests and research needs.

We’re now in a new generation of things, with a lot more widespread and serious usage of Haskell, and what we’re seeing increasingly are projects that try to leverage GHC but also build next to or on top of it, so they don’t have to reimplement every aspect of an increasingly rich language. Not every one will succeed, but it seems very promising, and I’m quite happy that GHC HQ is making good efforts to support this.

6 Likes

It’s always hard to give a good answer to questions like this, because popularity (of a language or a compiler) is always a complex mix of social and technical matters. The paper @sclv refers to is certainly a good baseline: it is explicitly about the genesis and history of Haskell. Once established there is certainly a very strong network effect, as @sclv says – a strong implementation leads to more users, which leads to more contributors, which leads to a better implementation. Getting that cycle started is as much luck as good judgement, and I pay tribute to the myriad others who have contributed to GHC. It is a shared success.

6 Likes

In case anyone’s wondering where Asterius is going, here’s a more detailed writeup: https://github.com/tweag/asterius/blob/TerrorJack/ghc-8.10/docs/roadmap.md#2021-q3

9 Likes

Urban Boquist’s PhD thesis has multiple ingredients:

  • a GRIN intermediate representation (IR) that can distinguish register and memory operations (STG cannot; Cmm can, but it is neither functional nor pure) — a small sketch of the idea follows below
  • whole program defunctionalization
  • call-context-insensitive static analysis (HPT = heap points-to)

I assume people mean all of these design decisions when they talk about GRIN. UHC and JHC (originally) also used the Boquist GRIN design. However, it is important to treat these components independently and to run, or at least imagine, experiments with alternative design decisions.
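
To make the first bullet concrete: in GRIN, every heap interaction is an explicit store/fetch/update operation, so forcing a thunk is ordinary code the optimiser can analyse and rewrite. Below is a toy model of that idea in plain Haskell (the `Node` constructors and the `eval` function are invented for illustration; this is not GRIN syntax or the actual implementation):

```haskell
import Data.IORef

-- A heap node is a constructor tag with fields; Fadd is a hypothetical
-- "thunk" node representing a suspended addition.
data Node = CInt Int
          | Fadd (IORef Node) (IORef Node)

store :: Node -> IO (IORef Node)      -- allocate a node, get a pointer back
store = newIORef

fetch :: IORef Node -> IO Node        -- read a node from the heap
fetch = readIORef

update :: IORef Node -> Node -> IO () -- overwrite a thunk with its value
update = writeIORef

-- Forcing a thunk is ordinary code built from fetch/update, so the
-- memory traffic is visible to an optimiser instead of being hidden
-- inside an opaque runtime eval/apply mechanism.
eval :: IORef Node -> IO Node
eval p = do
  node <- fetch p
  case node of
    CInt n   -> pure (CInt n)
    Fadd a b -> do
      CInt x <- eval a
      CInt y <- eval b
      let r = CInt (x + y)
      update p r
      pure r

main :: IO ()
main = do
  a <- store (CInt 1)
  b <- store (CInt 2)
  t <- store (Fadd a b)   -- a lazy "1 + 2" thunk on the heap
  CInt r <- eval t        -- explicit fetch/update happens here
  print r                 -- 3
```

The point is that this heap traffic appears in the IR itself, rather than being hidden behind the runtime as in STG.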

I’ve learned a ton of things during the development of the GRIN compiler project.
The most important lesson is to have an easy-to-use compiler laboratory that allows quick experimentation with various kinds of compiler pipelines and compilation techniques. To do this I need access to the whole-program IR, which alone requires a whole-program compiler. So whole-program analysis and optimisation remain a compiler pipeline design decision.
Another important decision is to narrow the scope of the project as much as possible. That is why I’m not interested in the surface language at all; I’m only interested in the compilation of a lazy functional intermediate representation (IR). This decision practically halves the complexity of the problem and saves a lot of mental energy. IMHO it was a big mistake that UHC and JHC tried to implement both the compiler frontend and backend. Maybe this is a problem in the case of GHC as well.

I’ve explored the source code and documentation of the alternative Haskell compilers in depth, because I did not want to make the same mistakes or redo the same experiments, at least not in the same way. IMHO UHC and JHC drew on the lazy functional compiler research but did not touch the ideas that come from other parts of PL/compiler research.
They narrowed their scope to functional programming, or just lazy functional programming. This is a mistake, because at the low level there are many similarities between programming language compilation problems.
For example, everyone wants to predict the program’s control flow graph and to replace indirect calls with direct calls, and there are plenty of ideas and approaches for handling these problems.
My goal was to study the problem in the broadest scope and read all related material, no matter which language it was created for.

I wonder why no one has mentioned the HRC/FLRC project (Intel Labs Haskell Research Compiler / Functional Language Research Compiler). IMHO it is the closest relative to the GRIN approach. Maybe that is not obvious at first due to the different naming and terminology, but its core idea is the same: use a low-level IR for optimization that can express structured memory operations.
Programs compiled with HRC/FLRC exceeded GHC-compiled programs in runtime performance. HRC used GHC as a Haskell frontend via GHC 7.x’s external-core IR.

BTW, whole-program analysis and optimization is certainly possible, because MLton does it. Link-time optimization (e.g. ThinLTO) is also quite a popular topic in the GCC and LLVM communities. But I’m interested in any approach. In fact, I recently added a runtime call graph builder feature to the external STG interpreter. (https://twitter.com/csaba_hruska/status/1421116379432882178)
It turned out that most of the live closures were called only a very few times (1-10), even in complex programs, and there is a small-to-moderate-sized cluster in the call graph that dominates the running time.
So my newest compiler/optimizer pipeline design approach is a profile-guided optimizing compiler, i.e. the program’s runtime profile could tell the optimizer which closures should be inlined or cloned and specialised.
I also plan to implement an escape analysis to replace the whole-program points-to analysis. BTW, HRC/FLRC does not require whole-program analysis either. Currently I am reading papers and PhD theses about how escape analysis is done in the Java/JVM world; it is an excellent source of knowledge.
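
Purely as an illustration of that profile-guided idea (the profile format, names, and threshold below are invented and are not part of the external STG interpreter), the optimizer could consume per-closure call counts from a profiled run and treat the dominant cluster as the candidates for inlining, cloning and specialisation:

```haskell
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)

-- Hypothetical profile: how many times each closure was entered
-- during a profiled run of the program.
type CallProfile = Map String Int

-- Crude heuristic: closures entered only a handful of times are left
-- alone; the frequently entered ones become candidates for inlining,
-- cloning and specialisation.
hotClosures :: Int -> CallProfile -> [String]
hotClosures threshold profile =
  [ name | (name, count) <- Map.toList profile, count >= threshold ]

-- Example with a made-up profile: only the dominant closures survive.
example :: [String]
example = hotClosures 1000 (Map.fromList
  [ ("fibWorker", 250000), ("parseHeader", 3), ("mainLoop", 80000) ])
```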

Without the external STG interpreter I could not have gathered the data/evidence to lead me in the right direction. So good tooling is the most important thing; it is impossible to figure out the ultimate solution to the problem in one step.
My current goal is to learn the runtime behaviour of Haskell programs, build intuition, and then design and implement a good compiler pipeline with as much automation as possible (e.g. generating the RTS).

8 Likes

Thanks for the extensive response. I mailed the main developer behind UHC and he also noted that the important nuance is that optimising literally the whole program is not feasible, but you can be a bit smarter in deciding which parts of the whole-program optimisation are essential and focus on those. He remarked that supercompilation is similar in that regard. And even if compile times are longer than GHC’s, if the produced binary is significantly faster then it might still make sense to run those optimisations, though perhaps not for every build. I’m more optimistic now.

Although, one point I’m still questioning is that GRIN seems to be untyped. I think you also mention in your paper on a modern look at GRIN that this makes it more difficult to compile to LLVM. But more importantly, one of the biggest lessons of GHC is that a typed core language is very useful for finding compiler bugs. I wonder if it would be worthwhile to explore a typed version of GRIN.
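
As a rough illustration of why a typed IR helps catch compiler bugs (a toy GADT sketch in plain Haskell, not an actual typed GRIN): if every IR constructor records the type of the value it produces, a transformation that builds an ill-typed term is rejected by GHC itself instead of surfacing as a silent miscompilation.

```haskell
{-# LANGUAGE GADTs #-}

-- A toy typed expression IR: constructors record the type of the value
-- they produce, so passes over the IR must be type-preserving.
data Expr a where
  IntLit  :: Int  -> Expr Int
  BoolLit :: Bool -> Expr Bool
  Add     :: Expr Int  -> Expr Int -> Expr Int
  If      :: Expr Bool -> Expr a   -> Expr a -> Expr a

-- A small constant-folding pass.  Its type, Expr a -> Expr a, means
-- GHC rejects any equation that would change the type of the term,
-- e.g. rewriting an Add into a BoolLit is a compile-time error.
constFold :: Expr a -> Expr a
constFold (Add (IntLit x) (IntLit y)) = IntLit (x + y)
constFold (If (BoolLit True)  t _)    = constFold t
constFold (If (BoolLit False) _ e)    = constFold e
constFold e                           = e

-- Example: folding (if True then 1 + 2 else 0) yields IntLit 3; a pass
-- that tried to return a BoolLit here simply would not compile.
demo :: Expr Int
demo = constFold (If (BoolLit True) (Add (IntLit 1) (IntLit 2)) (IntLit 0))
```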

HRC’s IR is called MIL, and it is typed. GRIN can be typed as well.
Automatic invariant checking is always a win.

4 Likes