I’ve started working on a project with the goal of eventually publishing multiple interlinked packages to Hackage. Having to update cabal files with new modules really grinds my gears, so I was hoping to use hpack or stack. However, most multi-package libraries I see in the wild seem to stick to just cabal.
As such, I was hoping for some help on:
- If anyone has an example multi-package repository that they’re happy with
- Insight on why stack/hpack is rare for libraries
Any other advice on structuring/maintaining Haskell libraries would be greatly appreciated too! I’ve been having trouble finding documentation on the matter.
I don’t know about packages I’m happy with, but amazonka, cabal itself, and tasty come to mind.
IMO you want cabal for libraries because you want to test across multiple GHC versions and different combinations of libraries. Stack is centered around reproducible snapshots, which is good for freezing a build graph and ensuring everything will always build the same way forever, but that’s explicitly not the goal for libraries.
You can use different snapshots with different GHC versions with stack, and some of my projects used to do that. But it’s annoying to maintain all those files, and you don’t care about reproducibility anyway. In fact, you often want to test against (at least) the latest versions of everything, so that your library keeps working with the most recent versions of all your deps.
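For reference, testing multiple GHC versions with stack means keeping one project file per snapshot, something like the following (filename and resolver are illustrative):

```yaml
# stack-ghc-9.6.yaml (filename illustrative)
# An LTS snapshot pins a specific GHC; lts-22.x corresponds to GHC 9.6.
resolver: lts-22.6
packages:
  - .
```

You then select it per invocation, e.g. `stack --stack-yaml stack-ghc-9.6.yaml test`, and repeat the file for each GHC series you support.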
I’ve never understood the point of multi-library packages. For Bluefin I just have multiple packages in the same repo, plus a top-level cabal.project: GitHub - tomjaguarpaw/bluefin
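The top-level `cabal.project` for that layout is short: just a list of the package directories. A sketch with placeholder names:

```cabal
-- cabal.project (directory names are placeholders)
packages:
  packages/foo-core
  packages/foo-codegen
  packages/foo-grammar-haskell
```

`cabal build all` then builds every listed package, and intra-repo dependencies are resolved from source rather than from Hackage.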
This is the first time I’m doing a “multi-library” style thing though, so there may be something I’ve missed.
a multi-package library, eg a Stack (or Cabal) project with more than one package. Prior to the introduction of cabal.project, multi-package projects were (I believe) a competitive advantage of Stack; or
a multi-library package, that is, a package that makes use of one or more public sub-libraries? Stack has only handled these since Stack 2.15.1 (released 2024-02-09).
I only use it in one or two projects, but for me it’s mostly about avoiding repetition, i.e. being able to use common stanzas. But this is arguably just covering for other shortcomings, and is a bit less of an issue now that we have GHC202X for example. Plus it comes with the major downside of not being able to independently version things.
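For concreteness, the single-package shape being described looks roughly like this: one cabal file, a `common` stanza, and public sub-libraries importing it (all names illustrative):

```cabal
-- foo.cabal sketch: one package, shared common stanza, public sub-library
cabal-version: 3.0
name:          foo
version:       0.1.0.0

common deps
  build-depends:    base >= 4.14 && < 5
  default-language: Haskell2010
  ghc-options:      -Wall

library
  import:         deps
  hs-source-dirs: src

library foo-extra
  import:         deps
  visibility:     public
  hs-source-dirs: extra
```

The `common` stanza is written once and imported everywhere, which is the repetition-avoidance being referred to; the cost is that `foo` and `foo-extra` must always share one version number.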
You can use my Gild tool to do this automatically, without using hpack.
Also I would recommend against a single package with multiple public sub-libraries. It kind of works most of the time, but you’re bound to run into some annoying paper cuts.
This seems useful. If it’s not currently possible to do the same with a cabal.project that defines common stanzas for the cabal files it references, then that sounds like it would also be useful!
Michael Snoyman’s advice was for Hpack users to check-in the generated Cabal file at a package’s repository. So, I think your question could be framed as: Why does a Cabal file (on Hackage) not (often) list the corresponding package.yaml?
That is true even for stack, pantry, and other libraries that have spun out of the Stack project.
In the case of Stack, it used to list its package.yaml as an extra-source-file. That changed in April 2018 (before my time) and was documented as: “leaving out package.yaml because it causes confusion with Hackage metadata revisions”.
That is, on Hackage, somebody can revise a Cabal file describing a package but they can’t revise a package.yaml file. So, the Hackage Cabal file and the package description in its package.yaml could ‘get out of sync’ - a recipe for confusion.
a multi-package library, eg a Stack (or Cabal) project with more than one package. Prior to the introduction of cabal.project, multi-package projects were (I believe) a competitive advantage of Stack; or
I’m pretty sure that this would be a multi-package library.
The project is fleshing out the very incomplete tree-sitter Haskell bindings and implementing optics for them. I expect I’ll end up with:
- A “core” package implementing bindings against the tree-sitter C runtime + common types and typeclasses
- A package implementing codegen from grammars
- A (mostly codegened) library per tree-sitter grammar
- (And leave the door open for van Laarhoven bindings without a dependency on optics-vl)
With the expected pattern being that users take a direct dependency on the optics bindings + relevant grammar(s).
I’d like to keep everything in a single repository and put e.g. compiler flags in a single place.
I am aware that an official tree-sitter library exists, but the repository hasn’t had PRs reviewed in a while, and I don’t have the emotional bandwidth to try and revive it right now or try and foist my own opinions on the project.
I find Stack’s “Stackage snapshot” model of the universe to be a poor fit for developing libraries, and most defensible at or near the leaves of a dependency tree (applications, internal proprietary code, etc). When developing public Hackage libraries, I want to support the widest practical range of dependencies so that my users have the greatest chance of finding a build plan, and to communicate that plan with bounds so the cabal solver can work. This also makes upgrading individual dependencies much easier for both me and for library users. The PVP provides good guarantees about what bounds are safe to use, and we have a pretty good community culture about not breaking backwards-compatibility in minor releases.
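Concretely, the bounds are how that supported range is communicated to the solver. A sketch of a `build-depends` stanza with PVP-style bounds (package names and versions illustrative):

```cabal
library
  build-depends:
    base        >= 4.14 && < 4.21,
    text        >= 1.2  && < 2.2,
    containers  ^>= 0.6
    -- ^>= 0.6 abbreviates >= 0.6 && < 0.7, i.e. one major series
```

The lower bound records the oldest version you actually test against; the upper bound is the PVP promise that the next major release may break you.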
As for hpack, I believe that YAML is a bad file format and that avoiding it wherever possible generally makes things easier. Between common stanzas to DRY up dependencies and tools like cabal-fmt to automatically maintain the module lists, I don’t see any practical benefit to adding another tool to the mix.
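To illustrate the combination being described: a `common` stanza carries the shared dependencies, and a cabal-fmt pragma regenerates the module list from a source directory. A sketch (names illustrative):

```cabal
common warnings
  ghc-options:      -Wall -Wcompat
  default-language: Haskell2010
  build-depends:    base >= 4.14 && < 5

library
  import:         warnings
  hs-source-dirs: src
  -- cabal-fmt: expand src
  exposed-modules:
    MyLib
```

Running `cabal-fmt --inplace` rewrites `exposed-modules` from the contents of `src`, which covers the “new module” chore that hpack is usually reached for.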
If your optics are only going to be lenses and prisms over record fields and sum type constructors, you can avoid providing lens or optics packages at all by providing Generic instances on those types. Both ecosystems have tools for generating lenses and prisms by type or by name.
I’d expect tree-sitter stuff to involve the sort of data structures where you’d want to provide (potentially indexed) folds and traversals. If the optics allow, you might find a good balance between features and dependencies by declaring van Laarhoven optics in the main package (since you can often do that with just a base dependency anyway) and declaring optics-optics as wrappers over the VL-optics. Then you only have to write them once.
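As an illustration of “van Laarhoven optics with just base”: the lens type and a hand-written field lens need nothing beyond `Functor`. `Node` here is a made-up stand-in for a syntax-tree type, not anything from the actual bindings:

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- Hypothetical stand-in for a syntax-tree node type.
data Node = Node { nodeName :: String, nodeChildren :: [Node] }

-- The van Laarhoven lens type, definable with only base.
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

-- A field lens over nodeName: apply f to the field, put the result back.
nameL :: Lens' Node String
nameL f n = (\x -> n { nodeName = x }) <$> f (nodeName n)

-- view and set fall out of instantiating f at Const and Identity.
view :: Lens' s a -> s -> a
view l = getConst . l Const

set :: Lens' s a -> a -> s -> s
set l x = runIdentity . l (const (Identity x))
```

Wrapping such definitions in optics-style newtypes afterwards means the optics themselves are only written once, as suggested above.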
FWIW I do see a moderate amount of package.yaml use. The main sticking points are:
- Hackage only understands cabal files, so you need to include one in the package anyway; conversely, it’ll only add a package.yaml to the package itself if you add it to extra-source-files. So you’ll only see them in development repos.
- Beyond a certain point, package.yaml is insufficient and you’ll end up using a cabal file directly.
- As mentioned by others, freezing your dependencies tightly as Stackage snapshots do (note that you also can’t usefully distribute a cabal.project.freeze file for cabal install builds) puts tight limits on who else can install your package, because it also implies freezing the GHC version.
I will also disagree slightly with the opinion that cabal is better than stack for development: I find that during the development phase it’s easier to stick to a single environment (e.g. a Stackage snapshot, a cabal.project.freeze file, or even a frozen Hackage index state) and leave testing other configurations to CI.
(FWIW, cabal itself freezes the index state for most development, but validate runs in CI use an unfrozen index state and no freeze file, so we find out quickly if an upstream dependency change impacts us. This is imperfect when boot libraries change, but that’s being worked on; currently cabal-install freezes boot libraries implicitly, even though this is overkill for the majority of people (it only matters for users of ghc-api).)
IMO the best reason to use multiple public libraries in a package is to capture the not-uncommon situation of a set of libraries which in practice are always released and used together at matching versions. That is, if you have no desire to support foo-a-2 building with foo-b-1, then it’s simplest to just publish everything together under one version number.
This is particularly relevant for libraries that are developed in the same repository, as it’s generally not very easy to test version combinations. Locally, everything is pinned by your source-control revision, so it works rather similarly to publishing everything under one version.
This is a familiar situation from executables. You can put your executable in a separate package from your library, but this is only really a win if you want to support building the executable with a range of library versions, and generally you don’t. Easier to have one version number.
This isn’t possible when publishing to Hackage: Hackage packages don’t carry cabal.project metadata, so you’d have to “reify” that into the cabal files, at which point you have something more like hpack anyway.
The term “multi-package library” creates some confusion because it’s so close to “multi-library package” and because in this context “library” often means a library component defined in a cabal file.
hledger is one example of a “multi-package project” (providing both applications and libraries) happily using both stack and hpack over a long period.
More details:
- stack and hpack are of course independent.
- I chose stack for UX and reproducibility, eg avoiding wasted time due to variance in build plans chosen for end users or in automated workflows.
- I chose hpack to minimise boilerplate and duplication across packages, avoiding wasted time from tedious and error-prone manual edits. Also for the more regular and familiar syntax.
- With modern cabal these reasons are a bit less strong than they were originally, but still quite strong for me.
- I provide a stackVER.yaml file for each major supported GHC version.
- And I follow best practice and include the generated .cabal files in version control.
- To minimise version control conflicts in those, I tend to (1) commit them separately from other files and (2) regenerate them only with the current release of stack and its built-in hpack.
Why is stack/hpack rare for libraries - are they? If so, I would guess it’s:
- to avoid unnecessary moving parts - when they’re not needed, less is more
- the usual headwind against them in some parts of the community, coming from (I imagine) (a) a desire for a simpler Haskell dev experience and story, and (for stack) (b) memories of old ecosystem dramas and (c) a preference for flexible automatic build plans without needing to think about snapshots (as a higher priority than reproducibility).