Don’t edit dependency bounds manually, with this CI setup

Over at “Dependency version bounds are a lie” I argued that manually maintained dependency bounds are both annoyingly tedious and also regularly wrong, at least in the sense that they are not actually tested by CI.

From the ensuing discussion I can conclude that the dependency bounds in your .cabal file should

  • never be written manually (yay!), and instead should
  • always be derived from what you are actually doing on CI.

The last point made me write a new tool, cabal-plan-bounds, that takes multiple Cabal build plans (plan.json) and changes the dependency version ranges in the .cabal file to reflect the versions that are actually used on CI. I hope eventually this functionality will come with cabal, but for now a separate tool allows quick experimentation with this idiom.

A nice CI setup

The cabal-plan-bounds tool is but a small building block for whatever simple or sophisticated setup you build around it, and in this post I want to show off and walk through a possible setup I conceived for a typical Haskell library (using candid, but that doesn’t matter here), with the following nice properties:

  • The build matrix is not hard-coded in .github/workflows/ci.yml, but instead there is a directory ci-configs with one Cabal config file per configuration.

    Each file defines one GHC version and possibly a Stackage package set I want to support, and hence test against:

    $ ls ci-configs/
    ghc-8.10.7.config  ghc-9.2.4.config  ghc-9.4.4.config      stackage-nightly.config
    ghc-9.0.2.config   ghc-9.2.5.config  stackage-20.5.config
    

    These files look as follows:

    $ cat ci-configs/ghc-8.10.7.config 
    import: cabal.project
    active-repositories: hackage.haskell.org:merge
    index-state: hackage.haskell.org 2022-12-21T10:40:48Z
    with-compiler: ghc-8.10.7
    
    $ cat ci-configs/stackage-20.5.config 
    import: cabal.project
    active-repositories: hackage.haskell.org:merge
    index-state: hackage.haskell.org 2023-01-05T07:18:37Z
    import: https://www.stackage.org/lts-20.5/cabal.config
    with-compiler: ghc-9.2.5
    

    By pinning the index-state, the subsequent package resolution should be deterministic.

  • The CI setup dynamically creates one build job for each of these files. No need to edit the Github workflow definition under normal circumstances, even if you add support for new GHC versions.

    Instead, just drop a new file in that directory, delete some, bump the index-state time stamp etc.

  • The build plans of all these jobs are combined and passed to cabal-plan-bounds, and the .cabal file is updated.

  • If the .cabal file is unchanged, all is well, and CI is green, else it complains.

    When it complains, it shows a diff of the .cabal file which you can apply manually. Or you set the update-bounds label on the Pull Request, and CI will simply push the fixed .cabal file to the branch.
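To make the idea of deriving bounds from build plans concrete, here is a minimal Python sketch. It is not the actual algorithm of cabal-plan-bounds: it simply assumes PVP-style versions, groups the versions seen on CI by their major version (the first two components), and emits one `^>=` clause per group, starting at the lowest version seen. The function name `caret_bounds` and the input versions are made up for illustration.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.2.4.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def caret_bounds(versions):
    """Build a '^>=' disjunction covering every version in `versions`.

    Assumption (not taken from cabal-plan-bounds itself): versions follow
    the PVP, so the first two components form the major version.
    """
    lowest = {}  # major version -> lowest full version seen in that series
    for text in versions:
        version = parse_version(text)
        major = version[:2]
        if major not in lowest or version < lowest[major]:
            lowest[major] = version
    clauses = [
        "^>=" + ".".join(str(part) for part in lowest[major])
        for major in sorted(lowest)
    ]
    return " || ".join(clauses)


# Hypothetical versions of one package, collected from three build plans.
print(caret_bounds(["2.0.1", "1.2.4.1", "1.2.5.0"]))
# prints: ^>=1.2.4.1 || ^>=2.0.1
```

Here `^>=1.2.4.1` is Cabal shorthand for `>=1.2.4.1 && <1.3`, so each clause covers exactly one major series that CI has actually built against.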

This is what it looks like on Github:

And here is the auto-generated change to the .cabal file:
commit 3e7e17d4034c40e05c841aea64a3ea26c9016fb3 (origin/pr/22)
Author: nomeata <nomeata@users.noreply.github.com>
Date:   Thu Jan 5 14:38:31 2023 +0000

    Update cabal bounds based on build plans

diff --git a/candid.cabal b/candid.cabal
index 21a07d6..1b05e65 100644
--- a/candid.cabal
+++ b/candid.cabal
@@ -47,42 +47,42 @@ library
     default-language: Haskell2010
     ghc-options:      -Wall -Wno-name-shadowing
     build-depends:
-        base >=4.12 && <5,
-        text >=1.2.3.1 && <2.1,
-        dlist >=0.8.0.8 && <1.1,
-        vector >=0.12.1.2 && <0.14,
-        bytestring >=0.10.8.2 && <0.12,
-        mtl >=2.2.2 && <2.4,
-        transformers >=0.5.6.2 && <0.7,
-        hex-text >=0.1.0.0 && <0.2,
-        crc >=0.1.0.0 && <0.2,
-        megaparsec >=8 && <9.4,
-        parser-combinators >=1.2 && <1.4,
-        scientific >=0.3.6.2 && <0.4,
-        cereal >=0.5.8.1 && <0.6,
-        leb128-cereal ==1.2.*,
-        containers >=0.6.0.1 && <0.7,
-        unordered-containers >=0.2.10.0 && <0.3,
-        row-types > 1.0.0.0 && < 1.1,
-        constraints >=0.12 && <0.14,
-        prettyprinter >=1.7 && <1.8,
-        template-haskell,
-        base32 >=0.1.1.2 && <0.3,
-        split >=0.2.3.4 && <0.3,
-        file-embed
+        base ^>=4.14.3.0 || ^>=4.15.1.0 || ^>=4.16.3.0 || ^>=4.17.0.0,
+        text ^>=1.2.4.1 || ^>=2.0.1,
+        dlist ^>=1.0,
+        vector ^>=0.12.3.1 || ^>=0.13.0.0,
+        bytestring ^>=0.10.12.0 || ^>=0.11.3.1,
+        mtl ^>=2.2.2,
+        transformers ^>=0.5.6.2,
+        hex-text ^>=0.1.0.7,
+        crc ^>=0.1.1.1,
+        megaparsec ^>=9.2.1 || ^>=9.3.0,
+        parser-combinators ^>=1.3.0,
+        scientific ^>=0.3.7.0,
+        cereal ^>=0.5.8.3,
+        leb128-cereal ^>=1.2,
+        containers ^>=0.6.4.1,
+        unordered-containers ^>=0.2.19.1,
+        row-types ^>=1.0.1.2,
+        constraints ^>=0.13.4,
+        prettyprinter ^>=1.7.1,
+        template-haskell ^>=2.16.0.0 || ^>=2.17.0.0 || ^>=2.18.0.0 || ^>=2.19.0.0,
+        base32 ^>=0.2.2.0,
+        split ^>=0.2.3.5,
+        file-embed ^>=0.0.15.0
 
 executable hcandid
     main-is:          hcandid.hs
     default-language: Haskell2010
     ghc-options:      -Wall -Wno-name-shadowing
     build-depends:
-        base ==4.*,
+        base ^>=4.14.3.0 || ^>=4.15.1.0 || ^>=4.16.3.0 || ^>=4.17.0.0,
         candid,
-        optparse-applicative >=0.15.1.0 && <0.18,
-        text >=1.2.3.1 && <2.1,
-        bytestring >=0.10.8.2 && <0.12,
-        hex-text >=0.1.0.0 && <0.2,
-        prettyprinter >=1.6.2 && <1.8
+        optparse-applicative ^>=0.17.0.0,
+        text ^>=1.2.4.1 || ^>=2.0.1,
+        bytestring ^>=0.10.12.0 || ^>=0.11.3.1,
+        hex-text ^>=0.1.0.7,
+        prettyprinter ^>=1.7.1
 
 test-suite test
     type:             exitcode-stdio-1.0
@@ -96,23 +96,23 @@ test-suite test
     default-language: Haskell2010
     ghc-options:      -Wall -Wno-name-shadowing -rtsopts
     build-depends:
-        base ==4.*,
-        tasty >=0.7 && <1.5,
-        tasty-hunit >=0.10.0.2 && <0.11,
-        tasty-smallcheck >=0.8.1 && <0.9,
-        tasty-quickcheck >=0.10 && <0.11,
-        tasty-rerun >=1.1.17 && <1.2,
-        smallcheck >=1.2 && <1.3,
+        base ^>=4.14.3.0 || ^>=4.15.1.0 || ^>=4.16.3.0 || ^>=4.17.0.0,
+        tasty ^>=1.4.3,
+        tasty-hunit ^>=0.10.0.3,
+        tasty-smallcheck ^>=0.8.2,
+        tasty-quickcheck ^>=0.10.2,
+        tasty-rerun ^>=1.1.18,
+        smallcheck ^>=1.2.1,
         candid,
-        bytestring >=0.10.8.2 && <0.12,
-        text >=1.2.3.1 && <2.1,
-        vector >=0.12.1.2 && <0.14,
-        prettyprinter >=1.6.2 && <1.8,
-        unordered-containers >=0.2.10.0 && <0.3,
-        row-types > 1.0.0.0 && < 1.1,
-        directory >=1.3.3.0 && <1.4,
-        filepath >=1.4.2.1 && <1.5,
-        template-haskell
+        bytestring ^>=0.10.12.0 || ^>=0.11.3.1,
+        text ^>=1.2.4.1 || ^>=2.0.1,
+        vector ^>=0.12.3.1 || ^>=0.13.0.0,
+        prettyprinter ^>=1.7.1,
+        unordered-containers ^>=0.2.19.1,
+        row-types ^>=1.0.1.2,
+        directory ^>=1.3.6.0,
+        filepath ^>=1.4.2.1,
+        template-haskell ^>=2.16.0.0 || ^>=2.17.0.0 || ^>=2.18.0.0 || ^>=2.19.0.0
 
 test-suite doctest
     type:             exitcode-stdio-1.0
@@ -120,12 +120,12 @@ test-suite doctest
     default-language: Haskell2010
     ghc-options:      -threaded
     build-depends:
-        base ==4.*,
+        base ^>=4.14.3.0 || ^>=4.15.1.0 || ^>=4.16.3.0 || ^>=4.17.0.0,
         candid,
-        doctest,
-        row-types > 1.0.0.0 && < 1.1,
-        leb128-cereal ==1.2.*,
-        prettyprinter >=1.6.2 && <1.8
+        doctest ^>=0.20.1,
+        row-types ^>=1.0.1.2,
+        leb128-cereal ^>=1.2,
+        prettyprinter ^>=1.7.1
 
 source-repository head
   type:     git

Looks silly doing that kind of change by hand, doesn’t it?

The commented workflow file

So how does it work? You can have a look at the pull request against candid introducing this setup, or let me simply walk through the .github/workflows/test.yaml file here:

Header

name: Haskell CI
on:
  push:
    branches: [master]
  pull_request:
    types: [opened, reopened, synchronize, labeled]

We also want to react to new labels on a PR, so we subscribe to the labeled event type as well.

Enumerating the build configuration

We need a first job to list the files in ci-configs and make that available, as a JSON array and via a job output, to the subsequent job:

jobs:

  enumerate:
    name: Enumerate CI configurations
    runs-on: ubuntu-latest
    outputs:
      configs: ${{ steps.enumerate.outputs.configs }}

    steps:
    - uses: actions/checkout@v3
    - id: enumerate
      run: |
        # List ci-configs/, strip the .config suffix, and collect the names in a compact JSON array
        ls -1 ci-configs/ | grep -o -P '.*(?=\.config$)' | jq -n -R -c '[inputs]' | tee configs.json
        echo "configs=$(cat configs.json)" >> "$GITHUB_OUTPUT"
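
If you want to check locally what the matrix would contain, the same transformation can be sketched in a few lines of Python. This is only an illustration of what the pipeline above computes, not part of the workflow; the function name `config_matrix` is hypothetical.

```python
import json


def config_matrix(filenames):
    """Mimic the enumerate step: keep '*.config' names, strip the suffix,
    and render the result as a compact JSON array (like `jq -c`)."""
    suffix = ".config"
    names = sorted(
        name[: -len(suffix)] for name in filenames if name.endswith(suffix)
    )
    return json.dumps(names, separators=(",", ":"))


# In a checkout one could pass os.listdir("ci-configs") instead of this list.
print(config_matrix(["ghc-9.2.4.config", "ghc-8.10.7.config"]))
# prints: ["ghc-8.10.7","ghc-9.2.4"]
```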

The build matrix

We use this output in the next job, build, which needs to depend on enumerate and uses that output in the build matrix. This way, we get one job for every file in ci-configs/:

  build:
    needs: enumerate
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        plan: ${{ fromJSON(needs.enumerate.outputs.configs) }}

Setting up the build cache

Haskell on Github Actions is no fun without caching the Cabal store. We key the store on the build configuration (else we run into concurrency issues):

    steps:
    - uses: actions/checkout@v3

    - name: cache cabal store
      uses: actions/cache@v3
      with:
        key: cabal-store-${{ runner.os }}-${{ matrix.plan }}-${{ github.sha }}
        path: ~/.cabal/store
        restore-keys: cabal-store-${{ runner.os }}-${{ matrix.plan }}-

Derive GHC version from the cabal config

A key convenience here is that we don’t have to explicitly configure the GHC version to use, as we can find that data in the ci-configs/*.config file. I use grep to get it out; not very robust, but simple enough.

(I find it more and more strange that cabal cannot simply treat the compiler like the other dependencies of my project: with a versioned dependency in the .cabal file, and fetched on demand. I believe it would simplify quite a few things.)

    - name: Detect GHC version
      run: |
        ver=$(grep -o -P '(?<=with-compiler: ghc-).*' ci-configs/${{ matrix.plan }}.config)
        echo "Detected compiler version $ver in ci-configs/${{ matrix.plan }}.config"
        echo "ghc_version=$ver" >> $GITHUB_ENV

    - uses: haskell/actions/setup@v2
      with:
        ghc-version: ${{ env.ghc_version }}
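
For comparison, here is a sketch of the same extraction in Python, for example to sanity-check a config file before pushing it. It assumes, like the grep above, that the file contains a line of the form `with-compiler: ghc-<version>`; the function name `ghc_version` is hypothetical.

```python
import re


def ghc_version(config_text):
    """Return the version from a 'with-compiler: ghc-X.Y.Z' line, or None."""
    match = re.search(
        r"^with-compiler:\s*ghc-(\S+)\s*$", config_text, re.MULTILINE
    )
    return match.group(1) if match else None


example = """\
import: cabal.project
active-repositories: hackage.haskell.org:merge
index-state: hackage.haskell.org 2022-12-21T10:40:48Z
with-compiler: ghc-8.10.7
"""
print(ghc_version(example))
# prints: 8.10.7
```

Returning None when the line is missing makes it easy to fail early with a clear message, instead of passing an empty version on to the setup action.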

Build and test, as usual

Finally, a normal build step. The only difference from what you might have in your project is the --project-file flag:

    - name: Build
      run: |
        cabal build --project-file "ci-configs/${{ matrix.plan }}.config"
        cabal test --project-file "ci-configs/${{ matrix.plan }}.config"

Store the build plan

Next, we need to get our hands on the plan.json and store it as an artifact. Very conveniently, artifacts uploaded under the same name from multiple jobs in the same workflow simply accumulate.

Storing the build plan as an artifact can be independently useful for debugging, as the build log does not necessarily tell you everything about the versions used.

    - run: mv dist-newstyle/cache/plan.json plan-${{ matrix.plan }}.json

    - name: Upload build plan as artifact
      uses: actions/upload-artifact@v3
      with:
        name: plans
        path: plan-${{ matrix.plan }}.json
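
When debugging with such an artifact, a few lines of Python are enough to list which package versions a plan contains. This is a sketch that assumes the `install-plan`, `pkg-name` and `pkg-version` fields that cabal writes into plan.json; the function name `plan_versions` and the sample data are made up.

```python
def plan_versions(plan):
    """Return the sorted (package, version) pairs used in a parsed plan.json.

    A package can appear several times (one unit per component), so the
    pairs are de-duplicated through a set first.
    """
    units = plan.get("install-plan", [])
    return sorted({(unit["pkg-name"], unit["pkg-version"]) for unit in units})


# Tiny hand-written stand-in for json.load(open("plan-ghc-9.2.5.json")).
sample_plan = {
    "install-plan": [
        {"type": "pre-existing", "pkg-name": "base", "pkg-version": "4.16.4.0"},
        {"type": "configured", "pkg-name": "text", "pkg-version": "1.2.5.0"},
        {"type": "configured", "pkg-name": "text", "pkg-version": "1.2.5.0"},
    ]
}
for name, version in plan_versions(sample_plan):
    print(f"{name}-{version}")
```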

Check and update the version bounds

And finally the special sauce: we use the build plans from above to check and update the .cabal file!

I make static builds of cabal-plan-bounds, which we can download and unpack here. If this tool becomes more popular, we could bundle these three lines in a dedicated Github Action, but for now curl works fine. You can (and maybe should) pin a specific version rather than using latest.

  bounds:
    runs-on: ubuntu-latest
    name: Calculate cabal bounds
    needs:
    - build
    steps:
    - uses: actions/checkout@v3

    - name: Fetch cabal-plan-bounds
      run: |
        curl -L https://github.com/nomeata/cabal-plan-bounds/releases/latest/download/cabal-plan-bounds.linux.gz | gunzip  > /usr/local/bin/cabal-plan-bounds
        chmod +x /usr/local/bin/cabal-plan-bounds

The plans can be downloaded as an artifact:

    - name: Load plans
      uses: actions/download-artifact@v3
      with:
        name: plans
        path: plans

Now we run the tool on the plans, and use git diff to show possible changes performed:

    - run: cabal-plan-bounds plans/*.json -c candid.cabal

    - run: git diff candid.cabal

Auto-update on demand

Few things are more annoying than having to do something manually because CI complains, when CI could just do it for you. Therefore, if the update-bounds label is set on the PR, simply push the new .cabal file to the repo:

    - name: Push updated .cabal file if labeled update-bounds
      if: contains(github.event.pull_request.labels.*.name, 'update-bounds')
      uses: stefanzweifel/git-auto-commit-action@v4
      with:
        commit_message: Update cabal bounds based on build plans
        file_pattern: '*.cabal'

Else fail

If there was a change (and we did not push that change), fail the build, as it means that the Cabal bounds are not as they should be:


    - name: Fail if .cabal file was changed
      run: git diff-files --quiet candid.cabal || exit 1

Can I haz that too?

Yes you can! Just follow that example, let me know if cabal-plan-bounds is not serving you well, and spread the word if it does.

This setup could use more refinement (automatic regular bumping of index-state, and even more convenience around changing the bounds); feel free to share your ideas here.

There was more interesting discussion elsewhere, and I realized that the CI setup above contains two ideas that are pretty much orthogonal:

  • How the build matrix is generated.

    Here, I am using a directory of Cabal config files (which can specify GHC versions, constrain dependencies, import Stackage package sets).

  • That the plans from each build are collected and compared to the .cabal dependencies using cabal-plan-bounds.

But of course you can use the former without the latter, if you like that way of specifying your build matrix.

And you can use the latter without the former, if you create your build matrix differently – statically in .github/workflows, dynamically from the tested-with field in your Cabal file (using @Kleidukos’s get-tested), using cabal.haskell-ci constraint sets, or any other way.
