It is our great pleasure to announce the first OSS release of guardian, a dependency boundary constraint checker for Haskell monorepos!
If you are struggling with a Haskell monorepo whose package dependencies keep growing in complexity, guardian could help!
It is developed at DeepFlow, Inc. and has been in use for almost two years.
With guardian, you can divide the packages into several “domains”, and define the allowed dependencies between those domains.
Guardian will make sure that package dependencies introduce only the predefined dependencies between domains, and report the result.
For more details and motivation, please refer to the README.
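To make the idea concrete, here is a minimal sketch of the kind of check described above. It is written in Python for brevity and is not guardian's actual implementation or configuration format. The package names, domain names, and the `find_violations` helper are all hypothetical, and the sketch only consults a direct allow-list; whether guardian also accepts dependencies that are allowed transitively is not covered here.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical classification: which domain each package belongs to.
PACKAGE_DOMAIN: Dict[str, str] = {
    "app-server": "app",
    "billing-core": "core",
    "shared-utils": "infra",
}

# Hypothetical rules: for each domain, the domains it may depend on.
ALLOWED: Dict[str, Set[str]] = {
    "app": {"core", "infra"},
    "core": {"infra"},
    "infra": set(),
}


def find_violations(
    package_deps: Dict[str, Set[str]],
    package_domain: Dict[str, str],
    allowed: Dict[str, Set[str]],
) -> List[Tuple[str, str]]:
    """Return (package, dependency) pairs that cross a forbidden boundary."""
    violations = []
    for pkg, deps in sorted(package_deps.items()):
        src = package_domain[pkg]
        for dep in sorted(deps):
            dst = package_domain.get(dep)
            if dst is None:
                continue  # external dependency, outside every domain
            if dst == src:
                continue  # assumption: dependencies inside one domain are fine
            if dst not in allowed.get(src, set()):
                violations.append((pkg, dep))
    return violations


# Hypothetical package dependencies; "aeson" stands in for an external one.
DEPS = {
    "app-server": {"billing-core", "shared-utils", "aeson"},
    "billing-core": {"app-server"},  # core -> app is not allowed
}
```

With this data, `find_violations(DEPS, PACKAGE_DOMAIN, ALLOWED)` reports only the `billing-core` to `app-server` edge, because `core` is not allowed to depend on `app`.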
According to the associated paper, the current effort to make a more modular GHC has completed a hierarchical restructuring of its modules (section 5.1). This means a layered approach to GHC’s design can at least now be envisaged (section 4.2).
To use guardian, would the GHC codebase need to be broken up into packages in order to enforce this layering, or can guardian also work at the level of individual modules in the codebase?
Thank you @atravers, for citing the interesting paper!
Applying guardian to GHC sounds like a really exciting idea! As you suspected, GHC would need to be broken up into several packages to apply guardian today. For the time being, guardian only handles package dependencies. Also, it natively supports only stack and cabal-install as build systems for now - that means, to use it with a custom build system, like Hadrian for GHC, you must write a custom adapter to reconstruct package dependencies.
On the other hand, the core of guardian is relatively simple - it treats the dependencies between single entities and those of their classifications - in theory, you can write a custom adapter that treats modules or whatever as if it was a package.
So maybe not @hsyl20’s renovation of GHC (not yet anyway :-), but @Ericson2314’s future break-up of base could benefit from using guardian…along with existing library packages:
First Shake, then Hadrian: ghc --make next week? But (for now) @bgamari will definitely know more than me about the intricacies of building GHC: I will defer to his judgment on that matter…
For what it’s worth, Hadrian largely defers to Cabal for package building, and each of the constituent packages of GHC is, to a lesser or greater extent, a normal Cabal package.
On the other hand, the core of guardian is relatively simple - it treats the dependencies between single entities and those of their classifications - in theory, you can write a custom adapter that treats modules or whatever as if it was a package.
That could work with GHC if we could create an entity from a subset of the modules, e.g. all the GHC.StgToCmm.* modules.
That sounds interesting! Currently, guardian-as-an-executable does not support adapters other than stack and cabal-install. However, guardian-as-a-library already provides an abstraction over adapters, and the core logic doesn’t depend on the details of adapters.
So there are two possible ways:
(1) Build a new binary depending on lib:guardian with a custom module-level adapter implementation.
(2) Enhance guardian-as-an-executable so that it accepts input from an external custom adapter (in the spirit of the bios cradle of hie-bios, in some sense).
Option (1) might seem relatively hard for those unfamiliar with the guardian internals.
It would be better to go with (2) for extensibility.
If we choose (2), an external adapter must emit the following information (in some suitable format; perhaps graphviz dot?):
The list of all entities in the project (e.g. packages in the standard setting; modules in the hypothetical case of today’s GHC), excluding external dependencies, and
The dependency graph between entities.
Given that, guardian takes care of the rest. As there is no restriction on entity names, one can simply use module names instead of package names.
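As a rough illustration of the two pieces of information listed above, the sketch below renders an entity list and a dependency graph as Graphviz DOT, dropping edges to anything outside the project. It is written in Python for brevity. The DOT layout, the `to_dot` helper, and the `Demo.*` module names are assumptions made for this example, not a format that guardian defines.

```python
from typing import Dict, Iterable, Set


def to_dot(entities: Iterable[str], deps: Dict[str, Set[str]]) -> str:
    """Render project entities and their internal dependencies as DOT."""
    known = set(entities)
    lines = ["digraph dependencies {"]
    # First piece of information: every entity in the project.
    for entity in sorted(known):
        lines.append(f'  "{entity}";')
    # Second piece: edges between project entities only.
    for src in sorted(deps):
        if src not in known:
            continue
        for dst in sorted(deps[src]):
            if dst in known:  # skip external dependencies
                lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical module-level input; "Data.Map" plays the external dependency.
ENTITIES = ["Demo.Core", "Demo.Adapter"]
DEPS = {"Demo.Adapter": {"Demo.Core", "Data.Map"}}

if __name__ == "__main__":
    print(to_dot(ENTITIES, DEPS))
```

Because entity names are just strings here, the same output shape works whether the entities are packages or modules.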
In the case of modules, one challenge is to collect module dependencies, of course. Package dependencies can be computed without actually compiling the code, but extracting the module dependency graph might need some computation, and perhaps must be done after the compilation. Perhaps we could borrow some ideas from graphmod.
Today, I worked on a PR along the lines of (2). This PR implements the interaction with an external process (as outlined above) and opt-in wildcard support in domain member (e.g. package or module) definitions. Hopefully, this could open up the possibility of applying guardian to the GHC codebase, as suggested by @hsyl20.
I tested these new features with the module dependencies of guardian itself as follows:
With this file, guardian interacts with graphmod to extract the module dependencies within the package, rather than the dependencies between packages, and checks their boundaries. Note the wildcards in the adapter-* sections.
This can also serve as a blueprint for a future division into packages.
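For readers wondering what wildcard members buy you, here is a small sketch of assigning modules to domains by pattern, written in Python for brevity. It assumes shell-style patterns as implemented by Python's `fnmatch` module; the wildcard syntax in the actual PR may differ, and the domain names and the `classify` helper are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

# Hypothetical domains whose members are module-name patterns.
DOMAINS: Dict[str, List[str]] = {
    "codegen": ["GHC.StgToCmm.*"],
    "core": ["GHC.Core", "GHC.Core.*"],
}


def classify(entity: str, domains: Dict[str, List[str]]) -> List[str]:
    """Return the sorted names of domains with a pattern matching entity."""
    return sorted(
        name
        for name, patterns in domains.items()
        if any(fnmatchcase(entity, pattern) for pattern in patterns)
    )
```

In this sketch `GHC.StgToCmm.*` matches `GHC.StgToCmm.Expr` but not a bare `GHC.StgToCmm`, which is why the `core` domain lists both `GHC.Core` and `GHC.Core.*`.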
The exact date is still unclear, but I want to release a new version soon.