"Modularizing GHC" paper


We are happy to announce the first public release of a paper we wrote about modularizing GHC!

It’s related to the following conversation here: Modularizing GHC, do we want an HFTP?


Many of GHC’s issues have occurred in some part because GHC developers more often than not tend to spend a huge portion of their time working on GHC rather than other codebases. [from the end of the paper’s Introduction]

Objection: speculation, your honor.

  • I don’t think it’s true GHC developers only work on (or have only ever worked on) GHC. As far as I’m aware, they all work on (or have worked on) other large-scale systems/in other languages and paradigms.
  • Even if this claim were true, you haven’t demonstrated this causes “many of GHC’s issues”.
  • Even if it’s caused some of the issues, do you really want to say in effect that GHC developers are incompetent/inexperienced?

I suggest you just cut out the insults, and focus on what you’re trying to fix; not how GHC got to be how it is. You’re already saying GHC was at first engineered as a monolithic system to compile source from file; and that other use cases like REPLs, IDEs, linters, cross-compilers weren’t anticipated.

I don’t perceive that as an insult.

Programming very often leads to tech bubbles, and programming takes a huge amount of time, especially working on a compiler. Competent compiler engineers, as all our GHC devs are, are already rare, but I’d guess it’s even rarer for a dev to have spent considerable time on two large-scale compiler projects.

And once you get used to a codebase, it gets harder to recognize architectural shortcomings, because you understand the code anyway. I think we’ve all experienced this.

So, fresh contributors or casual contributors very often see things quite differently.


I think this is fantastic and I’d love to see similar whitepapers about solutions to hard technical problems in the Haskell community.


I think it’d be great to see this on HackerNews :slight_smile:

This paper was incredibly engaging and well-written; best of luck in the slog!


Thanks for the paper! There is much good stuff in it.

Here are some thoughts.

  • Although GHC is complex, hard to modify, and often ill-structured, I think it is also pretty good in many ways, including:

    • It is a thirty-year-old piece of software that is still in a state of rapid development – and in its core pieces too, not just tinkering about the edges. I think that’s a testament both to Haskell’s static type system, which supports fearless large-scale refactoring (as the paper says), and to a coherent overall design and a decades-long process of relentless refactoring. Every bit has been rewritten, often more than once!

    • Many of the internal structures, notably the design of Core and STG, have been remarkably stable over time.

    • GHC has absorbed huge changes in its core mission: the design of Haskell itself. That kind of thing can lead to a huge accumulation of technical debt, but in GHC’s case much of it (not all, I will be the first to agree) has been paid off as we have gone along.

    All that said, the kind of refactoring described in this paper is warmly to be welcomed.

  • The paper makes many good points, but I think they all come down to two, in the end (I say this partly to see if others want to suggest other key points):

    1. Reduce coupling between components
    2. Don’t pass junk; that is, functions should not take arguments that they ignore.
  • The paper elaborates on (2) at length (wrt DynFlags). But one merit of (2) that is not really discussed is its impact on (1). If many modules take a DynFlags input, then they must depend on the DynFlags type, which in turn depends on all the data types in DynFlags. By narrowing the inputs to a function, you narrow the number of types that function depends on, and hence its (transitive) module dependencies, i.e. its coupling. Less junk leads to less coupling, and this may be the most important merit of (2).

  • So perhaps (1) “reduce coupling” is ultimately the main message of the paper, and indeed Section 5 opens with a plan that is solely about reducing coupling.

  • I note (in 5.2.1) that the paper recommends building a smaller version of DynFlags for (in this case) the Cmm code generation. That’s ok, but it cuts against the recommendations of 4.2.1. It would be nice to acknowledge the tension here, and that it is no fun to define a fresh data type for each function, containing only the fields that function needs. (Or, I suppose, a zillion implicit parameters.) Essentially I think there is implicit agreement here that the DynFlags/HscEnv pattern is OK (desirable, even), but only at a smaller scale: within one “component”, whatever that is, rather than the entire software system. I don’t know what a general guidance rule might be.

  • One big thing that the paper doesn’t discuss at all is the design of the API of the GHC library described in Section 2. The GHC library API has evolved based on what GHC happened to do, rather than being designed based on what would make sense to clients. Moreover, to get their job done, clients often call functions defined deep in GHC’s innards that were never specifically designed for external use.

    I have always thought that GHC-as-a-library would benefit from a thoughtful, client-led redesign that starts from a blank sheet of paper, sketches a design that makes sense to a client, and then refines that design in the light of the many complexities dealt with by the current API. Good library design is hard!

  • Even on the coupling story, I’d love to see the next layer of detail on what needs to be done, very concretely. For example, what do we need to do to uncouple the parser from the rest of the compiler? Ditto for Language.Haskell.Syntax, which should in principle be separable into its own package. Maybe identifying sub-goals like these, and then giving a checklist of what needs to be done for each, would help to unlock volunteer effort?
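To make the “less junk leads to less coupling” point concrete, here is a minimal, hypothetical sketch (the DynFlags stand-in and both function names are invented, not GHC’s actual API): the junk-passing version forces its defining module to depend on the whole configuration type, while the narrowed version depends only on the values it actually uses.

```haskell
-- Invented miniature stand-in for GHC's sprawling DynFlags record.
data DynFlags = DynFlags
  { optLevel  :: Int
  , verbosity :: Int
  , profiling :: Bool
  -- the real record has hundreds of fields
  }

-- Junk-passing version: the defining module must import DynFlags,
-- and hence transitively every type that DynFlags mentions.
shouldInlineJunky :: DynFlags -> Int -> Bool
shouldInlineJunky dflags size = optLevel dflags > 0 && size < 20

-- Narrowed version: same logic, but the defining module no longer
-- depends on DynFlags at all. Less junk, less coupling.
shouldInline :: Int -> Int -> Bool
shouldInline optLvl size = optLvl > 0 && size < 20
```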


Having some familiarity with Domain-Driven Design, I found two things which sounded a bit odd while skimming quickly through the paper:

  • not finding a mention of Bounded Contexts, which could be a great tool to untangle a bit of complexity;
  • the proposal to use a layered architecture instead of a hexagonal one, which generally provides better decoupling.

I’m not sure all concepts from OOP can just be imported into GHC. Compilers have to deal with intrinsically recursive data structures, rather more than commercial applications. Maybe worth considering hexaflexagon architecture.


John did that: Modularizing GHC [pdf] | Hacker News


Haskell (not just GHC) could do with a way to smash together some ad-hoc set of values/"component"s without needing to pre-define a datatype: label each component so the type system can ensure producer and consumer handshake on the set, and provide a lightweight way to pull out a value by label, or poke in a new value by label, without exposing how those elements are organised inside the datatype.

Oh, I just re-invented anonymous records, like they have in purescript (based on Javascript), like they have in nearly every durned language, like they have in Haskell.

cmmFoo <- cmmdoFoo (cmmProfile = cmmProf, cmmDoLinting = True)

Appearing soon - Chugs: the Compiler for the Haskell User’s Gofer System!

(…darn, it’s already 2022 May! Oh well.)

Assuming I understand the article correctly [!!!] I think the intention or goal is for:

  • the software system is built from a set of packages,
  • each package adheres to the single responsibility principle,
  • and within each package there is a dedicated type for its configuration (flags, environment, etc).

Admittedly this is more work: most packages instead seem to aim for a more coarse “level of componentry” e.g. libraries or relatively self-contained programs. But the “remodulization” of ghc-lib could allow for dual strategies:

  • if it’s obvious that a set of modules adheres to the SRP, they can be packaged up almost immediately;
  • where it isn’t so clear (the general case), a small library package can be used, thus deferring the final SRP-driven separation to a later date.

Again, all this assumes I understand the intent of the article - @hsyl20, would you care to comment on my ramblings?


label each component so the type system can ensure producer and consumer handshake on the set,

Oh, I just re-invented anonymous records, like they have in purescript (based on Javascript)

A possible disadvantage of anonymous records is that often you want dependency injection to be type-driven: you don’t really care about the field name of the component in some environment, you just want a Logger, or a Repository, or whatever. One way of achieving this in Haskell is through a typeclass that says “this global environment has a component of type Logger”.
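A minimal sketch of that typeclass approach, with all names invented for illustration:

```haskell
-- A component we want injected, identified by its type.
newtype Logger = Logger { logMsg :: String -> IO () }

-- "This environment has a component of type Logger."
class HasLogger env where
  getLogger :: env -> Logger

-- A concrete application environment; the field name is irrelevant
-- to consumers, who only see the constraint.
data AppEnv = AppEnv { appLogger :: Logger, appName :: String }

instance HasLogger AppEnv where
  getLogger = appLogger

-- Works with any environment that carries a Logger, without ever
-- naming the field.
greet :: HasLogger env => env -> IO ()
greet env = logMsg (getLogger env) "hello"
```

The rio library’s HasLogFunc class is a well-known example of this pattern in the wild.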

Another possible approach is the one used by the registry package: don’t rely on typeclass magic to wire components together. Instead, analyze the functions using the reflection facilities of Haskell and perform the wiring in library code.


I’m not advocating for anonymous records (although I would often like to reach for such a tool if it were available); but it’s a familiar concept that allows me to bypass the more advanced Haskell type juggling. As can be seen from my past questions in the learn category, my questions are all about how to control the type system to do what I want. For me, currently, the reflection facilities linked seem like impenetrable material without a simple practical example to start from.

As a non-type aficionado, I really don’t see the mismatch between anonymous records and dependency-injection-like mechanics. In OOP languages where DI is used, you generally tend to write programs based on interfaces, and the DI library will provide an interface-conforming implementation at runtime/compile time based on some config. Typeclasses could be seen as interfaces/abstracts, but my very vague understanding is that Backpack is more closely comparable to interfaces, and the way you’d implement DI-like systems. Or am I misunderstanding what Backpack is supposed to be?


The way I see it, “reduce coupling” is an implementation detail of having code that better reflects the conceptual model behind it. That’s an important distinction because there are a lot of ways to reduce coupling at the code level without making code any easier to understand, or even while making it substantially worse. (I have experienced this pretty sharply first-hand :P)

…but only at a smaller scale: within one “component”, whatever that is, rather than the entire software system. I don’t know what a general guidance rule might be.

My personal guideline is whether a component—however it’s reflected in the code—makes sense on its own. Core and STG are great examples: both of those are constructs that can be thought about (and are worth thinking about) without needing to know much about GHC or even Haskell in general. I try to do the same thing with as much of my code as possible: organize the code around self-sustaining ideas and lean against defining abstractions that only make sense in the context of the rest of the code. The ideal would be for the smaller record types to represent useful concepts on their own, but if the corresponding component is clearly defined, a record for component-specific options makes sense as well.


Thanks for reading through and leaving this thoughtful response, @simonpj. We appreciate it.

I have started drafting what I hope will become a combined response to your points from the 3 of us, sparing you from reading half-baked “thinking aloud” comments from each of us in turn. But before that is ready, I wanted to give you a “heartbeat” lest you think we had missed your message.

Great, thank you John


OK. Here it is from the 3 of us:

Yes, certainly we appreciate GHC is far from the worst offender here, and that it’s a blessing we can have this conversation at all. Much the same things could be said about e.g. GCC, but refactoring is so much harder in C or even C++!

Yes, the most central domain concepts like Core and STG have seen the most stability. The surface language and the compiler passes on it (parsing, renaming, and desugaring) have had more churn, but also plenty of attention to keep up with them. In many cases, the mission changes and resulting tech debt are separate from the language proper (driver concerns, plugins, HLS), or on the “periphery” of the language (TH, Backpack, …). These areas get less attention “in passing” from other work, and thus are now in need of more dedicated clean-up effort.

Glad to hear it!

We agree with this assessment — the global costs of passing junk far exceed the local costs, that “reduce coupling” is the main message, and that the impact of (2) on (1) is not directly addressed.

However, we would like to highlight a third major point: The process by which this situation arose is interesting in its own right (Section 4.2.5). The situation surrounding DynFlags grew organically and so we think that it is important to understand the genesis of the technical debt. Without understanding that genesis, we risk returning to the present scenario after implementing a solution.

We believe that laying out this genealogy in detail is useful data for GHC developers and other practitioners of Haskell, and that with this data we can begin to define best practices for the design of large systems in Haskell.

Actually, we would argue the resolution is that configuration records are never good, but finer-grained records are significantly less bad.

In Section 6.4.2, in the “Composition over configuration” paragraphs, we include some discussion of this tension. (Upon rereading, that section is rather abstract and unclear; hopefully our discussion here can help lead to refining it!) “No configuration, all composition” is the ideal, and this is achievable at least at the library level.

For example, libraries like bound are a great example of this ideal and of the single-domain-concept approach. Bound doesn’t care what your AST is at all, and just exists to handle variable binding in the abstract. There is nothing resembling a configuration record; instead, flexibility is achieved by leveraging classic abstractions in the monad and monad-transformer vein. The utility of those abstractions for this task is far from obvious, but once one gets used to the concept it works nicely with so many other libraries in the Haskell ecosystem.
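To illustrate the flavour of “no configuration, all composition” (this is a simplified nested-datatype sketch in the spirit of bound, not bound’s actual Scope-based API): the syntax tree is parameterised over its variable type, and substitution falls out of the Monad instance rather than out of any configuration record.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- A toy term type, parameterised over the type of free variables.
data Exp a
  = Var a
  | App (Exp a) (Exp a)
  deriving (Eq, Show, Functor)

instance Applicative Exp where
  pure = Var
  f <*> x = f >>= \g -> fmap g x

-- Bind is substitution: replace each variable by a whole term.
instance Monad Exp where
  Var a   >>= f = f a
  App l r >>= f = App (l >>= f) (r >>= f)

-- Substitution for free, with no configuration record in sight.
subst :: (a -> Exp b) -> Exp a -> Exp b
subst = flip (>>=)
```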

All that said, we’re the first to admit it’s really hard to imagine how GHC could be refactored to completely achieve that “no configuration, all composition” ideal. (Indeed, past conversations ruling out GHC taking on e.g. dozens of Hackage dependencies would seem to rule it out entirely.) To avoid getting bogged down in what the “decade out” ideal might be, while bringing attention to the boring initial steps that should be less controversial already and help align goals/imaginations once they are achieved, we are happy to say CmmConfig and similar are good enough for now, and not worry about what comes next at this time.

(We want to be “refactoring tortoises”, focusing on putting one MR in front of another :), and not “refactoring hares” that dream big but land no patches.)

See the “caveat warning” in Section 4.2.1 that discusses the pitfall of an XYZEnv. We want to establish a principle for avoiding both the “zillion per-function fresh data types” and “zillion parameters” extremes, but this may be an open research question for software systems in statically typed, pure functional programming languages (or at least for Haskell). Our argument here closely matches your reasoning, quoted above, about how “(2) Don’t pass junk” is a step towards “(1) Reduce coupling”. We chose a middle ground, erring on the side of more parameters and fewer records, to avoid the temptation of reusing records and thus getting back to a DynFlags-like situation.
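As a hypothetical sketch of that middle ground (the field names here are invented; the real CmmConfig differs): a component-local record holds only what one component needs, and a thin projection at the boundary is the only place it meets the global flags.

```haskell
-- Invented miniature of the global flags record.
data DynFlags = DynFlags
  { dfOptLevel  :: Int
  , dfDoLinting :: Bool
  , dfVerbosity :: Int  -- irrelevant to Cmm; deliberately not projected
  }

-- Component-local configuration: only what the Cmm pipeline needs.
-- Living alongside the component, it is unattractive to reuse elsewhere,
-- which guards against drifting back into a DynFlags-like situation.
data CmmConfig = CmmConfig
  { cmmOptLevel  :: Int
  , cmmDoLinting :: Bool
  } deriving (Eq, Show)

-- Thin projection at the component boundary; the component itself
-- never imports DynFlags.
initCmmConfig :: DynFlags -> CmmConfig
initCmmConfig df = CmmConfig (dfOptLevel df) (dfDoLinting df)
```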

We would completely support a client-led API design for GHC-the-library! Indeed, trying to come up with the right abstractions and generalizations without motivating clients is an infamous recipe for overeager overengineering and the “second-system effect”.

The flip side, though, is that projects like HLS are already contorting themselves — e.g. picking up obscure functions, as you say — to get the job done.
A purely empirical approach can fall short and get stuck in local optima, when ideally both GHC and HLS would change considerably, the former change enabling the latter.
Resolving that can be a bit of a chicken-and-egg problem.

To avoid that, we specifically chose to focus on the types of cleanups that can be informed by the GHC library itself — the beauty of “Domain-Driven Design” is that the domain really is client-agnostic to a significant degree, and thus freeing domain concepts from cruft is a local process that can proceed just by meditating on the code in front of us, without worrying about any clients/UIs, be they the venerable GHC-the-program or hypothetical unison-like IDEs that only exist as apples of our eyes.

This again relates to our approach of focusing on “boring”, immediate-term, less-controversial cleanups — or, as we said above, being a “refactoring tortoise”, not a “refactoring hare”. “Meditating on the code in front of us” might sound daunting and abstract too, but specific work like refactoring away DynFlags — especially at the current point, where the problematic dependency cycles that make this hard are largely cut — is concrete and straightforward!

We take your point. The roadmap currently exists on GHC’s issue tracker and is linked from 5.2.1; we should be more explicit that this roadmap exists and how to find it. The “current tasks” items are in order, and in some cases already started. For the GSoC we hope will happen this summer, we will be working according to this roadmap.

I am happy to say that since we released the paper, volunteers have started chipping in! For example, wee MR !8160, which is nearly ready to merge. We think this is a good sign that others are able to follow our roadmap, and that by writing down the problem and solution we have lowered the barrier to entry for this project.

Also, in the process of that MR we found items which we have deemed to be out of scope, and so we have been able to expand our checklist. Of course, we want to shrink, not grow, the list of tasks over time! But while we are in the early days of trying to build enthusiasm and get more people involved, better to have a healthy, detailed backlog than a vague sense of not being sure when we’re done. Residual tasks might unlock more parallelism too, as many of those are in the “oh, we found this additional nice-to-have cleanup” vein. More parallelism would make this an especially fruitful vein of work to mine for onboarding tasks to mint new GHC devs.

Also, I just learned from @mpickering about the configurable import lints HLint supports, which GHC already uses. I hope we can incorporate them too, to check our progress and prevent regressions.
They serve as a good intermediate step before we can actually split out separate packages for the parts which deserve it.

We do not yet know, and have not characterized, the work required to separate the parser or Language.Haskell.Syntax, but we agree that these are doable in theory. We cannot comment on an ideal solution, but based on our recommendations the parser must not rely on records like DynFlags. Otherwise, we risk a layer violation¹ that can tightly and transitively couple components, especially as the parser is often the first step in another service component’s workflow.

Thus, factoring out the parser is best done after factoring out DynFlags and the AST; in fact, these are probably milestones on the critical path for the parser. CountDepsAst and CountDepsParser are rather similar today — which was embarrassing to John when he first made CountDepsAst in Track the dependencies of `GHC.Hs.Expr.Types` (8ec6d62a) · Commits · Glasgow Haskell Compiler / GHC · GitLab — but he still maintains progress will come easier with the AST.

Additionally, factoring out the AST is more in the “Trees That Grow” vein than the “purge DynFlags” vein of work, but we did try to touch on that in the TTG part of 5.3.1. John hopes to continue working with @romes to flesh out that effort’s roadmap too, after which we can update the paper with more concrete details.

¹: The grammar is in the domain layer, along with the idealized transformation from the grammar to the AST. The generated parser is a bit more nebulous, as Happy in effect mixes in infrastructure-layer concerns. Other jobs in the .y file, like handling exact-print annotations, are also more infrastructure-layer tasks. So within the parser there is certainly more layer separation to be done. Nonetheless, DynFlags is not a domain-layer or infrastructure-layer concept, so we can be sure it doesn’t belong in the parser even before the parser’s internal layering is sorted out. Many of @int-index’s and John’s long-term goals for Happy make sense from a “Domain-Driven Design” perspective and ought to help with this, but that is a separate discussion.
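The layering constraint in the footnote can be sketched like this (a hypothetical miniature; all names are invented, and GHC’s actual parser options are richer): the parser consumes a small record phrased in its own domain vocabulary, and the only place it meets driver-level flags is a projection on the driver side.

```haskell
-- Domain-level parser inputs, phrased in the parser's own vocabulary
-- (invented miniature).
data ParserOpts = ParserOpts
  { poExtensions   :: [String]  -- enabled language extensions, by name
  , poKeepComments :: Bool
  } deriving (Eq, Show)

-- Driver-side flags (invented miniature); never imported by the parser.
data DriverFlags = DriverFlags
  { dvExtensions   :: [String]
  , dvKeepComments :: Bool
  , dvOptLevel     :: Int  -- meaningless to parsing; not passed through
  }

-- The single point where the two layers meet, on the driver side.
mkParserOpts :: DriverFlags -> ParserOpts
mkParserOpts df = ParserOpts (dvExtensions df) (dvKeepComments df)
```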


Thanks for such a comprehensive reply. Perhaps some of the thinking you describe above can make its way into the paper.

For example

To avoid that, we specifically chose to focus on the types of cleanups that can be informed by the GHC library itself

It’s useful to point out explicitly that there is a whole area of technical debt and design that you deliberately don’t tackle. For good reasons, indeed, but it helps to frame your goals.

One other thought

However, we would like to highlight a third major point: The process by which this situation arose is interesting in its own right (Section 4.2.5).

I think you mean 3.2.5. What you don’t say explicitly here is that in an imperative language there is often massive under-the-hood coupling. GHC started with DynFlags in a global mutable variable, as you say – which is the standard situation in an imperative language. We went against the spirit of functional programming, for the sake of convenience, and lived to regret it.

So we gradually moved to a situation in which we pass DynFlags to the functions that use it. That was a huge upheaval. It made the coupling more apparent, but it reduced coupling (because not all functions take DynFlags) and made it possible to have different DynFlags in different places (e.g. when compiling different modules). But in an imperative language, this step might never have happened.
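The contrast can be sketched in miniature (invented names; GHC’s actual history involved a far larger record): a global mutable setting hides the coupling and forces every caller to share one value, while an explicit parameter makes the coupling visible and lets different call sites, e.g. different modules, use different settings.

```haskell
import Data.IORef
import System.IO.Unsafe (unsafePerformIO)

-- The old style: one global, mutable setting (the classic Haskell
-- workaround using unsafePerformIO).
{-# NOINLINE globalOptLevel #-}
globalOptLevel :: IORef Int
globalOptLevel = unsafePerformIO (newIORef 0)

compileGlobal :: String -> IO String
compileGlobal modName = do
  o <- readIORef globalOptLevel     -- hidden, under-the-hood coupling
  pure (modName ++ " at -O" ++ show o)

-- The later style: the setting is an explicit argument, so the
-- coupling is apparent and per-call-site values become possible.
compile :: Int -> String -> String
compile o modName = modName ++ " at -O" ++ show o
```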


@mhitza you don’t need to learn about Typeable or Data.Dynamic to learn how to use registry. There is a small tutorial in the repository to get you started. I am happy to help if you want to give it a go. (In other news, there will soon be a registry-aeson package providing JSON encoders/decoders that you can gracefully evolve when your API needs to be versioned. It relies on the same interface and implementation ideas.)