GHC Blog - GHC on Apple M1 Hardware

To elaborate a bit further, we are not the only ones who have noticed that LLVM is slow. Mesa (particularly amdgpu's shader compiler) and at least one JavaScript implementation have removed LLVM from their compilation pipelines due to LLVM’s compilation performance. Moreover, rustc has been integrating an alternate backend for non-release builds, also largely due to the compilation performance of LLVM. GCC continues to be faster than Clang in Linux kernel compilation.

Ultimately, LLVM is very good at producing efficient code, but not terrific at doing so efficiently. I suspect that the reason for this is much the same reason why GHC compilation performance is not great: it’s far easier to write papers about better code generation than about faster compilation.

Of course, LLVM does have commercial users who no doubt care about compiler performance; their efforts are in part why GHC’s LLVM backend is as competitive as it is, despite needing to serialise/deserialise a rather verbose textual intermediate representation. Nevertheless, writing a fast, memory-efficient, modular compiler capable of sophisticated optimisation is not easy.

It also does not help that LLVM’s IR does not match GHC’s execution model particularly well. As a result, we need to give up some efficiency (namely by splitting up proc-points into distinct procedures) in order to shoe-horn GHC’s C-- representation into LLVM IR. Kavon Farvardin previously tried working with LLVM upstream to extend the IR to allow a more direct mapping, but there was some resistance from upstream (since all optimisations would need to account for the new construct).

Furthermore, no one has looked into which LLVM optimisations are truly pulling their weight (#11295). It would be really great to make progress on this issue in particular as I suspect there are some easy wins here.
