Haskell CI Group Notes
- Thursday, September 14 2023
Discussion summary
- A tale of how IOG discovered the value of version bounds
- Thoughts on approaches to keeping version bounds up to date
- Clarification on platform support, specifically Mac and Windows
- Questions about Docker images
Discussion topics
cardano-haskell-packages discovers version bounds
My fuzzy recollection of @andreabedini’s comments were:
- Andrea joined the project and simply couldn’t find a working build plan
- Too many packages
- No bounds specified anywhere
- Working combinations discovered by trial and error
- Bounds were introduced
- The discussion on conservative upper bounds was rehashed
- They settled on a policy of taking responsibility for adding an upper bound to dependent (downstream) packages when introducing breaking changes.
Thoughts on approaches to keeping version bounds up to date
cardano-haskell-packages’ policy
The policy for cardano-haskell-packages is facilitated with tooling.
- [Not stated explicitly, but I assume] you are alerted when you break a dependent package
- You can “press a button to add an upper bound to the dependent package”
- Alternatively you can patch downstream, or roll back your own change
This policy works best in a closed package system, but with enough build horsepower we could imagine automating such a thing for a critical subset of Hackage.
Terminology for use of upper bounds
Andrea introduced the terms “early integration” and “late integration”.
- late integration: use conservative upper bounds; relax when known-good
- early integration: use liberal upper bounds; tighten when known-bad
As I write up these notes, I note that “early integration” is the epitome of “move fast and break things”. In open package sets, packages with improper upper bounds can wreak havoc on build plans. But in closed sets, it is an optimization for the common scenario.
Both integration types would benefit from automation.
Discussion on platform support
How to handle MacOS
Hosted?
Hetzner used to have MacOS options. They might still, but they used to, too.
gitlab.org may have public MacOS runners soon.
Reproducible environments?
NixOS Foundation, as of 2019, handle MacOS runners and the challenge of
reproducible build environments by:
- Operate a fleet of Mac Minis
- Run Linux on the devices (presumably NixOS)
- Run virtualized MacOS on the host Linux system
This satisfies the requirement of running Mac software on Mac hardware, but gives a lot more control.
A newer option is macos-ephemeral:
- Register the devices with the Apple fleet management service
- Automate the wiping and reinstallation of the OS through fleet management
Building for Windows
We talked about a few different things here. I did a bit of research afterward to make sure I understood some things correctly. (Mistakes are still my own.)
There are a couple options for building for Windows.
Cygwin
- Aims to be a POSIX environment that runs on Windows.
- Applications compiled by its toolchain depend on a compatibility DLL.
- By definition, does not provide a native Windows experience.
MinGW
- A GNU toolchain for building native Windows apps
- Since it uses GCC, MinGW-compiled C++ is not compatible with MSVC-compiled C++
- But C code is ABI-compatible
MSYS
- A stripped down POSIX environment (originally Cygwin) suitable for building Unix packages
- Uses MinGW toolchain, so apps don’t depend on the Cygwin compatibility DLL.
- Note that MSYS2 is an independent rewrite of MSYS
Clang
- Can (cross-)compile to MSVC-compatible object code
Interesting thing to note: Rust provides two variants of Windows bindist.
- windows-gnu: Presumably compatible with GCC-compiled code
- windows-msvc: Presumably compatible with MSVC-compiled code
Questions about Docker images
I asked about interest in corralling the menagerie of Haskell-related Docker images in the wild. Docker is in widespread use for CI and production systems, meaning a lot of people building Docker images. Opportunity for consolidation?
- GitHub - haskell/docker-haskell exists, but maintainers are asking for assistance
- It was noted that Docker could be a distribution target for GHC. It would make cross-platform support Docker’s problem.
- Could help with repro cases, too
I also asked about downloading dependencies in CI: out-of-band versus in-band.
- out-of-band: Baking dependencies into the CI Docker images.
- in-band: Fetching dependencies as a (pre) build step
In my experience, many mature projects build their own Docker images for CI, but still end up fetching a lot of things as a CI step. I.e. they do both in-band and out-of-band.
- GHC is an example
- Downloading dependencies can dominate job time. Somebody gave an example (not GHC) of a
22 minute job that was fetching (and building?) dependencies for 20 minutes.
Is in-band better than out-of-band or vice versa? Unfortunately this is probably a topology question. Which mechanism is faster depends on where the bits are coming from. That’s usually bespoke for mature projects. I’m interested to hear more experience reports.
Finally, nix-snapshotter was suggested as a way to deduplicate data on Docker images.