Higher-order Virtual Machine (HVM): an optimal, GHC-like runtime written in Rust, with many comparisons to GHC

Hi everyone,

I stumbled upon this on Twitter: https://github.com/Kindelia/HVM/blob/master/HOW.md, while reading a thread by Paul Chiusano about different garbage-collection techniques for Unison.

I don’t have a sufficient understanding of GHC’s internals to assess whether the claims in the article are valid/interesting and potentially usable as guiding ideas for improving GHC’s execution model, so I’m curious about anyone’s judgement on this.

I’m especially curious about the following excerpts regarding GHC’s handling of lambdas and thunks:

For example, Haskell is lazy. To avoid “cloning computations”, it implements thunks, which are nothing but memoized references to shared expressions, allowing the (2 * 2) example above to be cached. This solution, though, breaks down when there are lambdas.

And then

References. They ruin everything. They’re the reason Rust is so hard to use. They’re the reason parallel programming is so complex. They’re the reason Haskell isn’t optimal, since thunks can’t share computations that have free variables (i.e., any expression inside lambdas). They’re why a 1-month-old prototype beats GHC in the same class of programs it should thrive in. It isn’t GHC’s fault. References are the real culprits.

And the purported solution being a lazy, layer-by-layer clone (like Rust’s .clone(), but lazier).

HVM’s runtime has no references. Instead, it features a .clone() primitive that has zero cost, until the cloned value needs to be read. Once it does, instead of being copied whole, it’s done layer by layer, on-demand.
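If I translate that idea into Haskell terms (purely as an analogy; HVM’s duplication works on its own graph representation, and the names below are mine), I picture something like a copy that proceeds one constructor layer at a time, driven by demand:

```haskell
-- Analogy only: in Haskell, immutable data makes copying pointless, but it
-- does show how laziness yields a "layer by layer, on-demand" copy.
data Tree = Leaf Int | Node Tree Tree

lazyClone :: Tree -> Tree
lazyClone (Leaf n)   = Leaf n
lazyClone (Node l r) = Node (lazyClone l) (lazyClone r)
-- The recursive calls are thunks, so only the layers a consumer actually
-- inspects ever get copied; untouched subtrees cost nothing.
```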

Why are thunks unable to share computations with free variables, and what does that mean in practice (as in, in a real-world program)? If this is all true as-is (and no nuance is needed), could this idea of a lazy clone primitive be implemented in GHC, and would it make sense?
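To make my question concrete, here is how I currently picture the thunk limitation (a minimal sketch; `slow` and the other names are mine, not from HOW.md):

```haskell
-- `slow` stands in for an expensive computation.
slow :: Int -> Int
slow n = sum [1 .. n]

closedThunk :: Int
closedThunk = slow 100000000
-- A plain, closed thunk: forced once, memoized, and shared by every later use.

perCall :: Int -> Int
perCall k = let s = slow (k + 1) in s + s
-- `s` shares work between its two uses *within* one call, but because it
-- mentions the bound variable `k`, nothing is shared between separate calls,
-- even calls with the same argument. GHC's full-laziness pass can float
-- closed subexpressions out of a lambda, but work that genuinely depends on
-- the binder is redone per call.
```

Is that the kind of sharing the article says references make impossible?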

And then there is the part about beta optimality:

Similar uses of this idea can greatly speed up functional algorithms. For example, a clever way to implement a Data.List would be to let all algorithms operate on Church-encoded Lists under the hoods, converting as needed. This has the same “deforestation” effect of Haskell’s rewrite pragmas, without any hard-coded compile-time rewriting, and in a more flexible way. For example, using map in a loop is “deforested” in HVM. GHC can’t do that, because the number of applications is not known statically.

Why can’t GHC know the number of applications statically? My understanding is that it’s simply because we can’t know when the recursion ends (since infinite lists are allowed), but then does the proposed solution prevent infinite lists, or is it something else entirely?
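For reference, here is a minimal sketch of the Church-encoded-list idea from the excerpt (the encoding is my own illustration; HVM’s internal representation differs):

```haskell
{-# LANGUAGE RankNTypes #-}

-- A list represented by its own fold.
newtype CList a = CList (forall r. (a -> r -> r) -> r -> r)

fromList :: [a] -> CList a
fromList xs = CList (\c n -> foldr c n xs)

toList :: CList a -> [a]
toList (CList k) = k (:) []

cmap :: (a -> b) -> CList a -> CList b
cmap f (CList k) = CList (\c n -> k (c . f) n)
-- `cmap g (cmap f xs)` builds no intermediate list: the two maps fuse by
-- definition, without rewrite rules. GHC gets a similar effect from
-- foldr/build fusion via RULES, but only where the composition is visible
-- at compile time; iterating `cmap f` a number of times chosen at runtime
-- is exactly what a compile-time rewrite cannot reach, which I take to be
-- the "map in a loop" point.
```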

Thanks for any explanation.

4 Likes

Purely from a lambda-calculus-evaluator perspective, HVM seems to be on par with GHC, or even to beat it thanks to its automatic parallelism. That is impressive.
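For contrast, parallelism in GHC today is opt-in rather than automatic; a minimal sketch with the `parallel` package (compile with -threaded, run with +RTS -N):

```haskell
import Control.Parallel.Strategies (parMap, rdeepseq)

-- Each inner `sum` becomes a spark the RTS may run on any core; the result
-- is the same whether or not the sparks actually run in parallel.
parSums :: [[Int]] -> [Int]
parSums = parMap rdeepseq sum
```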

On the other hand, practical performant programming currently involves loops and arrays. I think that is where GHC shines: it can compile many high-level programs down to low-level loops, and it gives you access to arrays when you need them. HVM doesn’t support that, and it may be difficult for it to do so. So in the short term, while we are still writing mostly sequential code, GHC seems to retain the upper hand.
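As a concrete example of the kind of high-level code GHC turns into a loop: with -O2, foldr/build fusion removes both the enumeration and the map below, so no intermediate list is ever allocated.

```haskell
import Data.List (foldl')

-- Fuses into a plain accumulator loop over unboxed Ints at -O2.
sumSquares :: Int -> Int
sumSquares n = foldl' (+) 0 (map (\x -> x * x) [1 .. n])
```

And when random access is needed, Data.Array or the vector package (whose operations also fuse) fills the gap.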

However, it seems like we are moving towards a future where our CPUs get more and more cores, so parallel programs might have a big advantage in the future.

I expect that modifying GHC’s internals to adapt these ideas from HVM is nearly impossible. I think the best course of action would be to completely replace the STG back end with HVM. Perhaps many Core optimizations would have to be rewritten as well. I think there has been some talk in the past about adding multiple back ends to GHC in the context of GRIN (Graph Reduction Intermediate Notation), but that still seems like a very distant future.

To be honest, I don’t know why knowing the number of applications would enable GHC to optimize code more in this case.

3 Likes

…if anyone’s interested.

3 Likes

As for the evaluation mechanism, this looks relevant:

The (In)Efficiency of Interaction (Beniamino Accattoli, Ugo Dal Lago, and Gabriele Vanoni).

It shows that, in exchange for good space behaviour, abstract machines derived from the workings of interaction nets can have appalling (exponential) time behaviour. Where HVM sits in this time-space trade-off is an exercise I leave to its designers…

2 Likes

…with such an endeavour possibly assisted by:

A Representation of Lambda Terms Suitable for Operations on Their Intensions (Gopalan Nadathur and Debra Sue Wilson).

https://dl.acm.org/doi/pdf/10.1145/91556.91682

It describes the use of suspensions to delay substitutions in the result of a lambda-abstraction, permitting beta-reductions to be atomic operations.
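A rough sketch of that idea, simplified to a Krivine-style machine over closed de Bruijn terms (the paper’s suspension notation is richer: it also handles reduction under binders and merging of delayed environments):

```haskell
data Term = Var Int | Lam Term | App Term Term   -- de Bruijn indices

data Closure = Closure Term [Closure]            -- a term plus its delayed env

-- Weak-head evaluation with a stack of pending argument closures. The beta
-- step is atomic: it extends the environment in O(1) instead of walking the
-- body to substitute.
whnf :: Closure -> [Closure] -> Closure
whnf (Closure (Var i)   env) stk = whnf (env !! i) stk
whnf (Closure (App f a) env) stk = whnf (Closure f env) (Closure a env : stk)
whnf c@(Closure (Lam b) env) stk = case stk of
  []       -> c                                  -- weak head normal form
  a : stk' -> whnf (Closure b (a : env)) stk'    -- delayed substitution
```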

1 Like