Hi all,
I’ve submitted a custom GSoC proposal titled “Documenting and Improving Cmm”, and I wanted to share it here with the community — both for feedback and to spark discussion on some possible refinements.
https://drive.google.com/drive/folders/1Nw2h4cKNJBnU0yBEVXHNq0J4vw_IW27Y
The project begins with a documentation phase focused on Cmm, which is central to GHC’s backend but remains under-documented and fragmented. After that, the work could proceed in one of two directions:
Route A: LLVM Backend Improvements
- Replace GHC’s internal LLVM AST with a maintained library like
llvm-codegen
orllvm-pretty
, which now supports parsing and round-tripping textual IR. - This would simplify GHC’s LLVM backend and make it easier to support new LLVM versions.
- It could also introduce optional Cabal/Stack dependency support for the LLVM backend. Since the native codegen would remain available, GHC itself wouldn’t strictly depend on Hackage — only the LLVM backend would.
However, I understand that some GHC developers may be reluctant to introduce Hackage dependencies inside GHC, even if they’re optional. As an alternative, I’ve considered documenting and polishing GHC’s existing internal LLVM AST, and offering it as a standalone library. This could serve both GHC and external tools, while staying closer to the status quo.
Route B: SSA-Based Register Allocation (or ANF-Style Alternatives)
Benjamin Maurer previously attempted to improve register allocation in the native Cmm backend by introducing live range splitting. However, because variables in Cmm are mutable and can be redefined, lifetimes were hard to track precisely.
To address this, he introduced SSA-like annotations, giving variables single definitions and explicit value flow. Unfortunately, his implementation also introduced a novel SSA-aware graph allocator, different from traditional Briggs-style allocators. It turned out to be slower at both compile-time and runtime than the well-optimized linear allocator, and was never merged.
My proposed refinement is simpler:
Use Benjamin’s SSA representation, but pair it with the existing Cmm graph allocator — extended only with live range splitting. This avoids new allocator complexity while retaining the clarity of SSA.
Alternatively, I’ve been experimenting with ANF-style transformations that eliminate redefinition and mutation, offering SSA-like benefits without requiring explicit SSA conversion.
Example in pseudocode (not Cmm syntax, but translatable):
int sum_upto(int limit) {
int sum = 0;
for (int i = 1; i <= limit; ++i) {
sum += i;
}
return sum;
}
→ Converted to an ANF-like recursive form:
int sum_helper(int i, int limit, int acc) {
bool cond = i <= limit;
if (!cond) {
return acc;
} else {
int acc1 = acc + i;
int i1 = i + 1;
return sum_helper(i1, limit, acc1);
}
}
int sum_upto(int limit) {
int start = 1;
int acc0 = 0;
return sum_helper(start, limit, acc0);
}
Because there’s no mutation, lifetimes are precise and allocator-friendly. The only caveat is that recursive calls may exceed Cmm’s argument register limits (see GHC#24019). However i believe i can solve this problem with the llvm backend without introducing a performance penalty
Additional Thoughts
Unrelated to the proposal, I’ve also been exploring the idea of compiling Cmm to portable and fast C, as a lightweight backend. I’ll post more about that separately — it may be too much of a detour for GSoC.
I’d love feedback on any of this, especially:
- Whether reusing and publishing GHC’s LLVM AST is preferable to switching libraries.
- Whether SSA+existing allocator is a viable simplification of Maurer’s work.
- Whether ANF-style rewriting has potential for targeted optimization passes.
Thanks!
Diego Antonio Rosario Palomino
UTEC, Lima — GSoC 2025 applicant