With the recent focus on improving module-level parallel compilation using semaphores (and the problems that come with cabal-install and GHC sharing custody of semaphore management across multiple libc implementations), I have been broadening my horizons on what a compiler should look like.
Moreover, speaking with Alan Zimmerman during the Haskell Ecosystem Workshop (co-located with ZuriHac 2024) pointed me in the direction of C#'s compiler, Roslyn, and the paradigm of query-based compilers. Chatting with Moritz Angermann and Andrea Bedini was also very eye-opening!
The idea is the following: an alternative driver for the compilation pipeline would be implemented as a daemon that takes requests (much like a normal server), with queries like:
- Compile this module to native code
- Fetch the type of this declaration
- Completions for members of a record
- Docstring for an identifier
and so on.
And the command-line tool ghc would be a client for this daemon.
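To make the query idea a bit more concrete, here is a minimal sketch of how the example requests above could be modelled and memoised. Everything in it is hypothetical (the `Query` constructors, `answerQuery`, `stubCompute`); it is not GHC's API, and unlike a real query-based compiler it never invalidates a cached answer when a source file changes.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical request type: one constructor per example query.
-- Module and identifier names are plain strings to keep the sketch small.
data Query
  = CompileModule String              -- module name
  | TypeOfDeclaration String String   -- module name, declaration name
  | RecordCompletions String String   -- module name, record type name
  | Docstring String String           -- module name, identifier
  deriving (Eq, Ord, Show)

-- Answers are plain text here; a real daemon would return structured data.
type Answer = String
type Cache = Map.Map Query Answer

-- | Answer a query, reusing the cached answer when the same query was
-- already asked. The first argument stands in for the actual compiler work.
answerQuery :: (Query -> Answer) -> Query -> Cache -> (Answer, Cache)
answerQuery compute query cache =
  case Map.lookup query cache of
    Just answer -> (answer, cache)
    Nothing ->
      let answer = compute query
       in (answer, Map.insert query answer cache)

-- | Placeholder for real compiler work, used only to exercise the cache.
stubCompute :: Query -> Answer
stubCompute query = "answer to " ++ show query
```

Asking the same query twice with the cache threaded through only calls the compute function once, which is the property that makes a long-lived daemon attractive compared to one-shot invocations.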
One big advantage would be that ghc-daemon would be in charge of scheduling module builds across the capacities of a machine. An interesting design question would be “How many daemons should live on your system?” For now I envision one daemon per user, which avoids the problems of a daemon that can read and write everywhere on the filesystem.
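Since a single daemon process would own all the scheduling, the concurrency limit could in principle be an ordinary in-process semaphore. Below is a rough sketch using `QSem` from base; `runBounded` is a made-up name, and the jobs are arbitrary `IO` actions standing in for module builds. It does not model dependencies between modules.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.QSem (newQSem, signalQSem, waitQSem)
import Control.Exception (SomeException, bracket_, try)

-- | Run every job with at most 'limit' running at the same time.
-- Results come back in the order of the input list; a job that throws
-- yields a 'Left' instead of blocking the caller forever.
runBounded :: Int -> [IO a] -> IO [Either SomeException a]
runBounded limit jobs = do
  slots <- newQSem limit
  pending <- mapM (spawn slots) jobs
  mapM takeMVar pending
  where
    spawn slots job = do
      done <- newEmptyMVar
      _ <- forkIO $ do
        -- Take a slot before running the job and always give it back.
        result <- try (bracket_ (waitQSem slots) (signalQSem slots) job)
        putMVar done result
      pure done
```

A daemon could pick the limit from the number of capabilities of the machine; the point is only that no cross-process coordination is needed once one process holds the semaphore.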
cabal-install and ghc wouldn’t have to coordinate through semaphores on disk for such a thing. cabal-install would call ghc, which would send the compilation order to the daemon (or start the daemon if it is not running) and get information about the build in return.
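The "connect, or start the daemon first" step on the client side could look roughly like this. It is only a sketch: `connectOrStart` is a hypothetical helper, and the two actions it takes (how to connect, how to launch the daemon) are left abstract because the transport has not been decided.

```haskell
import Control.Exception (IOException, try)

-- | 'try' specialised to I/O errors such as "connection refused".
tryIO :: IO a -> IO (Either IOException a)
tryIO = try

-- | Hypothetical client start-up: connect to the daemon, and if that fails
-- with an I/O error, start the daemon and try exactly once more.
connectOrStart :: IO conn -> IO () -> IO conn
connectOrStart connect startDaemon = do
  firstAttempt <- tryIO connect
  case firstAttempt of
    Right conn -> pure conn
    Left _ -> startDaemon >> connect
```

A real client would also need to wait until the freshly started daemon is ready to accept connections, and to cope with two clients trying to start it at the same time; both are ignored here.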
I have been prototyping in my spare time with a toy compiler in order to get a feeling for how things should fit together. I’d be interested to chat with folks who are interested!
Now you may think “I know just the thing that does that”, and you would be right to think about HLS! It’s not an entirely new idea nor is it unheard of.
I haven’t been talking about distributed compilation, but that is something available today with the external interpreter.
Here is a very simple diagram to illustrate things a bit:
So yeah, let’s bring the client-server compiler architecture to a new level!
References (please do read / watch them, they explain things better than I would in a forum thread):
- Query-based compiler architectures (Olle Fredriksson's blog)
- Anders Hejlsberg on Modern Compiler Construction: https://www.youtube.com/watch?v=wSdV1M7n4gQ
- VK’s nocc distributed C++ compiler daemon: nocc/docs/architecture.md at master · VKCOM/nocc · GitHub
- Modularizing GHC, §4.2 Layering and Componentization: https://hsyl20.fr/home/files/papers/2022-ghc-modularity.pdf#section.4
PS: I realise after having written this that the subject was brought up by @brandonchinn178 in 2023 in the thread “GHC build server for optimizing one-shot compilation”.