Dependency version bounds are a lie

Very nice! I think you should be able to easily copy the second half of my suggested workflow (collect build results and validate/update bounds) and combine it with yours.
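For concreteness, that second half could look roughly like the following. This is only a hedged sketch: the plans/ directory and the .cabal file name are invented, and the exact cabal-plan-bounds flags should be double-checked against its --help.

    # Each CI job stashes its dist-newstyle/cache/plan.json somewhere persistent;
    # assume here that they have all been downloaded into ./plans/.
    # cabal-plan-bounds then derives (or checks) the build-depends ranges from
    # exactly those plans.
    cabal-plan-bounds plans/*.json -c mypackage.cabal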

Personally, I see some benefits to the directory-of-files approach, in particular that you can easily activate a configuration locally (just pass it to cabal using --project-file).
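For illustration, such a directory could hold one project file per tested configuration (the names below are hypothetical), and any of them can be exercised locally like this:

    # e.g. ci-configs/ghc92-oldest.project, ci-configs/ghc96-latest.project, ...
    cabal build all --project-file=ci-configs/ghc92-oldest.project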

While I sympathize with the issue this is trying to solve, I, as a nixpkgs maintainer, am quite afraid that widespread adoption of this tool will actually make the ecosystem worse. Finding a working build plan across different libraries (of which one might not have a very recent release, which is totally fine and very common) will just get harder when they constantly bump lower bounds. I think this might affect every cabal user who uses two libraries which don’t depend on each other, so yeah, probably every cabal user, and the burden on Stackage and nixpkgs maintenance would increase a lot.

I really like the approach of trying to have wide bound ranges and testing them with --prefer-oldest and --prefer-latest.

My prediction would be that widespread adoption of this scheme would decrease the ratio of non-broken Haskell packages in nixpkgs significantly.

I hope not!

I’d also like to test my packages against the package set as defined by nixpkgs (stable and unstable); I just need to figure out the best way of doing that. (nixpkgs doesn’t, by any chance, already generate a cabal config like Stackage does?)

In addition, I think it goes well with this approach to have some jobs use --prefer-oldest, or some other way to keep the tested range large.
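As a minimal sketch, a CI matrix could simply run the build twice, once with the solver's default preference for the newest versions and once with --prefer-oldest (which requires a reasonably recent cabal-install):

    # default job: the solver picks the newest versions within the bounds
    cabal build all
    # extra job: the solver picks the oldest versions within the bounds,
    # so the lower end of the ranges stays exercised
    cabal build all --prefer-oldest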

Sure, it works: Stash · NicolasT/landlock-hs@d7ba147 · GitHub

cabal-plan-bounds formats build-depends differently from how cabal-fmt does, though, so several lines in the diff are not actual changes.

I am so excited for --prefer-oldest, I’ve been wanting that for a long time to verify lower bounds!! Thank you!!


The problem with people testing against nixpkgs is, ironically, that it doesn’t help nixpkgs maintenance at all. We are kind of victims of our own success when that happens. Projects which test (only) against current nixpkgs very often break in nixpkgs when we update nixpkgs, because they never had an incentive to fix any build errors with newer packages. They only start testing against the new versions of their dependencies when we release a new version of nixpkgs, but at that point we already needed the fix.

That’s more about upper bounds, though, whereas my fear with cabal-plan-bounds is more about lower bounds. So it might not be that bad. Actually testing against the newest versions available on Hackage, Stackage nightly and Stackage LTS will probably bring us a long way.


It probably wouldn’t be hard to create a nix derivation which generates a cabal config file like that.
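For reference, the Stackage-generated cabal.config is essentially one big constraints block, so a nixpkgs-derived equivalent would only need to emit something of this shape (the versions below are made-up examples):

    -- shape of a Stackage-style cabal.config; versions here are invented
    constraints: aeson ==2.1.2.1,
                 text ==2.0.2,
                 containers installed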

Having CI which tries to figure out the loosest possible lower bounds and then autogenerates the bounds seems fine to me. But then the bounds should really be convex.
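To make "convex" concrete with an invented dependency: the bounds should form a single contiguous interval rather than a union of the tested islands.

    -- convex: one contiguous range
    build-depends: aeson >=1.5 && <2.3
    -- not convex: only the exact ranges that happened to be tested
    build-depends: aeson (>=1.5 && <1.6) || (>=2.0 && <2.3)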

IMHO, this is a case where downstream is pushing a problem they might have upstream.

The intent of something like cabal-plan-bounds is to ensure a package author who wishes to do so can publish a package with (lower and upper) bounds set to something that’s known to work, validated by CI or otherwise (manually). Either way that validation has a cost (CI may be somewhat free-ish nowadays, but still), hence the package author might decide to bump lower bounds, because keeping them low (and tested) is costly when new upper bounds need to be supported.

This is true even if there’s no technical reason to bump said lower bounds, i.e., the package may still work with older versions of the dependency.

If downstream (a distribution, a company, …) wants to ship or use the package with older versions of its dependencies, it’s up to them to validate that things work properly and to apply bound relaxations as appropriate (e.g., by applying a vendor patch during the package build process, or maybe by using allow-older definitions in cabal.project).
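As a sketch of that last option, a downstream cabal.project (the names here are placeholders) can relax a specific bound without touching the package itself:

    -- downstream cabal.project or cabal.project.local
    packages: .
    -- accept somepackage even though its declared lower bound on somedep
    -- is newer than the version being shipped
    allow-older: somepackage:somedep
    -- the symmetric knob for upper bounds
    allow-newer: somepackage:somedep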

Nice!

You can pass multiple cabal files to the tool in one invocation; then the plans are parsed only once.

Can you just run cabal-fmt after it?

:man_facepalming: Of course.
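So the pipeline would simply end with something like (file names invented, flag spelling from memory):

    cabal-plan-bounds plans/*.json -c mypackage.cabal
    cabal-fmt --inplace mypackage.cabal   # restore the usual formatting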

That’s what I meant! Add nixpkgs as an additional setting you test against, to keep that lower bound alive and tested. It’s not perfect yet (maybe someone will create a tool that calculates a set of "full range covering" build plans…), but it might work most of the time.

Thinking about it: haskell-ci could create build plans automatically (derive constraint-sets, I guess) based on the bound definitions in build-depends, to ensure all bounds are validated (and, as a safety check, have something like cabal-plan-bounds at the end).
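haskell-ci already has hand-written constraint-set stanzas that go in this direction; the idea would be to derive them from the declared bounds instead. Roughly like this (the set is invented and the field names are from memory, so check the haskell-ci documentation):

    -- cabal.haskell-ci
    constraint-set lower-bounds
      constraints: aeson ==1.5.*, text ==1.2.*
      tests: True
      run-tests: True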


I see your point. But downstream can also be another word for “user” here.

A new user would probably start shouting "cabal hell!!!" when they try to build a simple webserver and the SQL library and the HTTP library don’t have a valid common build plan, because neither of the upstreams tested them together. It’s not likely to happen, but it’s certainly possible if maintainers only ever test the newest configuration at the time of release and only accept exactly those bounds.

One other thing: the biggest reason for upstream to maintain bounds instead of downstream is that upstream has more information. Downstream can (and nixpkgs does) run the test suite as well. But only upstream knows about breaking changes which don’t cause build or test failures, and for those, bounds are really important and downstream needs to be careful with them. If upstream only creates bounds based on builds and tests succeeding, this advantage becomes less relevant.

I think there is reasonable agreement here. Using tests to verify and create bounds is really cool. I just want to urge everyone not to forget that giving reasonably wide bounds is important, so if you make a CI setup which creates bounds, you should keep that in mind. In particular, I want to argue for checking the upper and lower bounds and then assuming that the values in between are fine; I think that’s a good enough approximation of testing everything.

If we take the "only release tested bounds" idea seriously, we couldn’t reasonably create bounds from more than one build plan at once, because even bounds created from just two build plans admit a combinatorial explosion of valid (but untested) build plans. Accepting that, making bounds generally convex seems reasonable to me.


I think I generally agree that it’s bad for a version bound to be wrong - in either direction. Either allowing a build plan that won’t work, or disallowing a build plan that would work. Both are really bad!

I deal with redundant upper bounds a lot. I occasionally deal with overly lax lower bounds. But if there’s a tool that’s automatically pushing version bounds on packages up, and up, and up, then that’s gonna make "overly strict lower bounds" go from being "a problem no one has right now" to "a problem everyone has to deal with occasionally".


I fully agree that bounds are a service to the downstream user, including distributions (else I’d just use a freeze file), and that wider bounds are better here. And I’d say a tool like cabal-plan-bounds doesn’t change that; it just means I have to do different things to achieve these bounds (e.g. think about the build plans I want to test, use --prefer-oldest to keep testing the lower bound, or simply pin the exact versions I want to keep tested in the cabal config).
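The last option is just a project file kept in the repository that pins the oldest versions one wants to keep supporting, e.g. a hypothetical cabal.project.lower (versions are placeholders), built in CI with cabal build all --project-file=cabal.project.lower:

    -- cabal.project.lower
    packages: .
    constraints: aeson ==1.5.6.0,
                 text ==1.2.5.0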

Yes, that’s more work than just leaving stale lower bounds in place. But if the sanity of distro maintainers and other users has so far relied on accidental, hope-for-the-best stale lower bounds, then I think there is room for improvement :slight_smile:

…at least for Haskell, continuous integration looks more like continuous irritation (or perhaps insanity). To someone like me who hasn’t built anything big in Haskell for a few years now, it just seems more and more excruciating.


I just use Stackage LTSes… which were, I think, in part created because of things like this.

I build my libraries using Stackage LTSes, and I take the versions from the lowest LTS I support for the lower bounds, and from the highest for the upper bounds. And then I guess I sometimes have to adjust the upper bounds when new packages come out, but other than that I haven’t really had any issues, and none were raised in the GitHub issues section either :man_shrugging:

But I can see how YMMV and I sympathise with my fellow Haskellers who prefer cabal.

(Getting your package thrown out of stackage is my way of knowing when new updates to libraries have been published, and I, at the moment, don’t mind updating my libraries when that happens. But then again, I have like 2 that I maintain, so…)


…then most Haskell users switched to ghcup, stack, and others.

We currently encode positive information in the cabal file - these dependencies are (very likely) gonna work. Hence lower and upper bounds.

What would happen if we’d encode only negative information, i.e. only exclude versions that are semantically not compatible or are known to not compile? Presumably that means few upper bounds, and maybe fewer lower bounds as well? Who’d benefit, and for whom would it be worse?

It seems it would be much less busywork for maintainers, and from what @maralorn says it sounds like it might also be better for package set maintainers. It likely doesn’t matter for users of package sets. Would it be worse for users using Hackage directly?

Maybe in that world semantic breaking changes that don’t break the build would be even worse, and maybe avoided more?


I think it would be basically an unmitigated disaster. In particular, “known not to compile” is a huge question – known not to compile by who? By maintainers who update those bounds? By anybody who attempts to compile? If the default assumption is that things will work, then the solver will find lots of plans that don’t work, and things will break constantly.

If, on the other hand, we had tooling that automatically tested nearly every configuration and then introduced bounds for everything based on whether it compiled or not, then that would not be so bad. But at that point the "gap" between "positive" and "negative" information would narrow almost entirely (leaving out semantic changes that aren’t reflected in the types), because we would in fact have "perfect" information.