I was reading about the Koka language (for its effect system specifically) and came across the fact that it does not have a traditional garbage collector. Rather, the memory management of Koka programs is done automatically through what is called Perceus.
Is there a fundamental reason this is impossible/undesirable to implement in GHC, or is this just a matter of putting engineering effort into it? Sadly, I don’t have access to the publication, which may already explain why/why not.
My understanding is that generational garbage collectors are very good at processing a bunch of “garbage” with minimal effort. Instead of deallocating every single thing by hand (one free per object), you move the objects you keep (usually far fewer than the ones you throw away) and re-assign the previous zone to be overwritten when allocating newer, fresher objects. Suddenly you’ve cut the number of calls to free and it’s no longer a bottleneck.
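As a toy sketch of the copying idea (not GHC’s actual collector — the heap representation and all names here are invented for illustration), the survivors are computed from the roots and copied out wholesale, and the old space is reclaimed in a single step rather than freed object by object:

```haskell
import qualified Data.Set as Set

type Addr = Int

-- a heap object: some payload plus the addresses it points to
data Obj = Obj { payload :: String, refs :: [Addr] }

-- addresses transitively reachable from the roots
reachable :: [(Addr, Obj)] -> [Addr] -> Set.Set Addr
reachable heap = go Set.empty
  where
    go seen [] = seen
    go seen (a : rest)
      | a `Set.member` seen = go seen rest
      | otherwise =
          go (Set.insert a seen) (maybe [] refs (lookup a heap) ++ rest)

-- "collect" by copying only the live objects into a fresh to-space;
-- the entire from-space is then recycled for new allocation, so no
-- per-object free ever happens
evacuate :: [(Addr, Obj)] -> [Addr] -> [(Addr, Obj)]
evacuate heap roots = [ (a, o) | (a, o) <- heap, a `Set.member` live ]
  where live = reachable heap roots
```

The cost of a collection is proportional to the live data, not to the garbage — which is exactly why generational collectors thrive when most objects die young.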
Reference counting as implemented in Koka or Lean requires a heap that is acyclic, or at least a heap that can be partitioned into a DAG of compact regions, each of which is collected as a unit. Lazy evaluation, and more generally mutability, is fundamentally at odds with that, because it introduces cycles into the heap.
If you want Lean’s RC, you have to give up on thunk update. E.g.,
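(The elided snippet here is presumably the classic knot-tied stream, which relies on thunk update to become a cyclic, constant-space structure:)

```haskell
ones :: [Int]
ones = 1 : ones
-- after the CAF is forced, its thunk is updated in place so the tail
-- points back at the cons cell itself: a one-cell cycle in the heap
```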
will no longer operate in constant space, but rather needs to allocate each cons cell anew.
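Concretely, without thunk update the stream behaves as if it were produced by a generator that rebuilds its cell on every demand (a sketch; onesGen is my hypothetical name):

```haskell
-- no thunk update means no sharing: consuming n elements of this
-- stream allocates n fresh cons cells instead of revisiting one
onesGen :: () -> [Int]
onesGen () = 1 : onesGen ()
```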
Of course, in this case you could simply allocate ones into a compact region and maintain the DAG-of-regions invariant. But in general it is difficult to predict statically how big a region must become to eliminate all cycles. You could declare the whole heap one big region, but then you can only free it at the end of the program, i.e., you lose precision/space safety.
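GHC already ships something along these lines: compact regions, exposed through the ghc-compact package. A minimal use (assuming the GHC.Compact API as shipped with recent GHCs):

```haskell
import GHC.Compact (compact, getCompact)

main :: IO ()
main = do
  -- copy a structure into its own region, evaluating it as it goes;
  -- the region has no outgoing pointers and is freed as one unit when
  -- the Compact handle becomes unreachable
  region <- compact (map (* 2) [1 .. 100 :: Int])
  print (sum (getCompact region))
```

Note that compact regions come with exactly the restrictions discussed above: the data must be fully evaluated and may not contain functions, so they sidestep rather than solve the thunk-update problem.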
Is it worth giving up on laziness and unrestricted mutability to get a simple and efficient GC algorithm? Perhaps it is! Time will tell, and we will surely have a good comparison in a few years’ time.
Aside from the issues with laziness and mutation, it’s not clear to me that Haskell would benefit much from a radical change in memory-management technology. Much of the value of Haskell comes from the existing ecosystem of code, and that code has been written assuming the performance tradeoffs of a traditional tracing GC.
Part of using Lean 4 effectively is ensuring that data structures are unique to allow mutation instead of copying, and that has a pervasive effect on the style of performance-sensitive code, which is different from the way one would write performance-sensitive Haskell.
I’m not longing for a different memory management mechanism in Haskell, and I certainly am not suggesting that real engineering efforts should be made in this direction.
This is more of a question about the differences between Koka and Haskell that allow Koka to do without a tracing garbage collector while still sharing a lot of DNA with Haskell.
FYI: Elton Pinto and Daan Leijen (one of the authors of the Perceus algorithm) will present a paper on “Exploring Perceus For OCaml” at the ML workshop this year. There is also a master’s thesis on this topic by Elton Pinto – you might find some answers there.