The ^>= operator in .cabal files

Yes, this is another feature that’s necessary to implement: asking cabal “what is the latest version of GHC you can find a build plan for?”

That can’t be too hard: instead of pinning base (and boot library) versions, pass it to the solver without bounds, then extract the base version it comes up with, then map that to a GHC version.
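
Until that exists, here is a brute-force approximation from the shell (a sketch only; it assumes the GHC versions to probe are already installed, and it tries compilers one by one rather than letting the solver pick base freely):

# try each compiler, newest first, and report the first one for which
# the solver finds a build plan (the versions listed are just examples)
for ghc in ghc-9.6 ghc-9.4 ghc-9.2 ghc-9.0; do
  if cabal build --dry-run -w "$ghc" >/dev/null 2>&1; then
    echo "newest GHC with a build plan: $ghc"
    break
  fi
done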

Also see 'ghcup satisfy' command · Issue #109 · haskell/ghcup-hs · GitHub

1 Like

I know you are replying to Julian; from my PoV using an older GHC is not a problem.

I think a PoC should not be too hard, but I believe the cabal-install codebase has to go through some changes before we can use it this way (the ghcup ticket you mention links to haskell/cabal#6885, which I believe is relevant).

E.g. pre-installed boot packages always exist and their versions are pinned (because they cannot be reinstalled anyway), so you cannot leave them “floating”.

I did try --allow-boot-library-installs but I am not sure what the error is telling me:

❯ cabal build --dry-run --allow-newer=aeson:base --allow-boot-library-installs -v3 -w ghc-9.4
...
Component graph for ghc-bignum-1.3:
Component graph for ghc-prim-0.7.0: component lib
component ghc-prim-0.7.0-85190006ed57da9be8e03ab6d533fb5dbf69877426d00246e104507bd30abbc3
    include rts-1.0.2
unit ghc-prim-0.7.0-85190006ed57da9be8e03ab6d533fb5dbf69877426d00246e104507bd30abbc3
    include rts-1.0.2
    GHC.CString=ghc-prim-0.7.0-85190006ed57da9be8e03ab6d533fb5dbf69877426d00246e104507bd30abbc3:GHC.CString,GHC.Classes=ghc-prim-0.7.0-85190006ed57da9be8e03ab6d533fb5dbf69877426d00246e104507bd30abbc3:GHC.Classes,GHC.Debug=ghc-prim-0.7.0-85190006ed57da9be8e03ab6d533fb5dbf69877426d00246e104507bd30abbc3:GHC.Debug,GHC.IntWord64=ghc-prim-0.7.0-85190006ed57da9be8e03ab6d533fb5dbf69877426d00246e104507bd30abbc3:GHC.IntWord64,GHC.Magic=ghc-prim-0.7.0-85190006ed57da9be8e03ab6d533fb5dbf69877426d00246e104507bd30abbc3:GHC.Magic,GHC.Prim.Exception=ghc-prim-0.7.0-85190006ed57da9be8e03ab6d533fb5dbf69877426d00246e104507bd30abbc3:GHC.Prim.Exception,GHC.Prim.Ext=ghc-prim-0.7.0-85190006ed57da9be8e03ab6d533fb5dbf69877426d00246e104507bd30abbc3:GHC.Prim.Ext,GHC.Prim.Panic=ghc-prim-0.7.0-85190006ed57da9be8e03ab6d533fb5dbf69877426d00246e104507bd30abbc3:GHC.Prim.Panic,GHC.PrimopWrappers=ghc-prim-0.7.0-85190006ed57da9be8e03ab6d533fb5dbf69877426d00246e104507bd30abbc3:GHC.PrimopWrappers,GHC.Tuple=ghc-prim-0.7.0-85190006ed57da9be8e03ab6d533fb5dbf69877426d00246e104507bd30abbc3:GHC.Tuple,GHC.Types=ghc-prim-0.7.0-85190006ed57da9be8e03ab6d533fb5dbf69877426d00246e104507bd30abbc3:GHC.Types
Component graph for base-4.15.1.0: component lib
CallStack (from HasCallStack):
  withMetadata, called at src/Distribution/Simple/Utils.hs:368:14 in Cabal-3.10.1.0-inplace:Distribution.Simple.Utils
Error:
    Dependency on unbuildable library from ghc-bignum
    In the stanza 'library'
    In the package 'base-4.15.1.0'

PS: This is ignoring any external dependencies, which would add another set of problems.

Similarly – just ran into this trying to help someone.

Does anyone care to explain what constraints need to be added to make the following work?

cabal install hoogle-5.0.18.3 --constraint='crypton-x509 < 0.0'

Note that this runs into something mysterious with constructors appearing in warp's Transport type somewhere in the 3.3.x series.

To be precise: fixing http-conduit's bounds for aeson-2.2.0.0 necessitated revisions to 25 releases, which I made in my capacity as Hackage Trustee. This is why I support the current recommendation of “have upper bounds and raise them in response to new releases”: it is my experience that omitting upper bounds not only amplifies the amount of work required to respond to breakage, it tends to push that work onto people other than the package maintainers.

4 Likes

Note that in the above, the constraint I already added (giving part of the solution) is necessary, and took people quite some time to figure out (maybe an hour of back-and-forth on IRC), because warp-tls itself does not have an upper bound on the version of tls it uses (cf. warp-tls: HTTP over TLS support for Warp via the TLS package). The older versions therefore do not restrict themselves to the cryptonite-using tls and allow mixing and matching with the forked, crypton-using tls. Once this situation occurs, trustees can, and probably should, figure out what revisions need to be made to avoid this sort of problem.

However, figuring out those revisions, as these examples show, is highly nontrivial and involves a ton of prior releases, all of which have missing bounds. Figuring out a bounds-bump revision, on the other hand, is very easy and rapid work (see the sketch below).
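
For illustration only (the bounds and version numbers here are invented, not the actual revision), a bounds-bump of this kind is a one-line metadata change:

-- warp-tls.cabal, before the revision: no upper bound on tls
build-depends: tls >= 1.5
-- after the revision: the newer, incompatible series is excluded
build-depends: tls >= 1.5 && < 1.7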

See also can’t install warp-tls.3.3.6 in macOS · Issue #936 · yesodweb/wai · GitHub for more on this current situation.

2 Likes

So you agree that hackage revisions are a terrible hack and hackage should actually reject packages without upper bounds? That is one of the possible solutions, but the status quo is broken.

1 Like

Part of the challenge here is that it’s easy to forget that even after you release a new version with a fix or better bounds, all the old, unbounded dependencies in your old releases are still in play. Each release with no upper bound is some lingering debt. Unfortunately it’s easy to have a mindset that only the most recent release “matters”.

3 Likes

Yes, in fact both semver and PVP are terrible, because they encourage frequent API iteration and so you naturally end up with many actively used “branches”.

But that ship has sailed.

Sorry but I don’t understand where this negativity comes from. It’s not a “terrible hack”.

Even if Hackage outright rejected packages without upper bounds, the ability to adjust the bounds is what allows us to extend the life of a package.

A newer version of a dependency will eventually come out; and, as I have been told many times, there’s a good chance it will still be compatible after all. So what do we do?

Release a new minor version, as I think most ecosystems do? Redistributing a single file is a much cheaper solution.

And who does that? The author or maintainer might be MIA or have just moved on. This is when a group of “repository curators” can step in.

(Needless to say: automated testing could make all this easier and less error-prone. There have been cases of bad revisions, but they can be fixed quickly.)

At some point a code change will be required to keep the package working with newer versions of its dependencies. At that point a new version is necessary, and if the author is unavailable that might be the end of the story (but there is a process for adopting a package).

I don’t want to claim that this is perfect, only that, at least from some point of view, it makes sense.

E.g. I feel this perfectly supports the “I wrote some Haskell packages during my PhD” kind of scenario: the original author will disappear, the content is valuable, and we still want to use it even if very few people understand it.

8 Likes

Well, there are many reasons:

  1. It breaks an important invariant that package tarballs are self-contained (this matters for distros, tooling, mirrors, scripts, …). Interestingly… semver demands that this is true: “Once a versioned package has been released, the contents of that version MUST NOT be modified. Any modifications MUST be released as a new version.”. Instead… we’ve successfully converted a set of package tarballs into something you can only query correctly via an API that’s specific to the hackage infrastructure (the hackage index format is also ad-hoc and “internal” btw.).
  2. It burns out hackage trustees: this is easy to see from the interactions I’ve had with them, the turnover, and the amount of panic my PVP PRs have caused.
  3. It gives hackage trustees “backdoor access” to anyone’s package metadata. Some users might not like that, and there is no rigorous process behind it like the NMU.
  4. It’s a “let it break, then fix it” approach. But the fix is usually tighter constraints, not a code patch that improves compatibility.
  5. Hackage revisions can break freeze files that don’t pin the index (a sketch of the index-pinning workaround follows this list). This has happened before, when a revision caused tighter upper bounds than necessary.
  6. There’s no automation around it at all. Everything relies on explicit communication and people doing manual labor.
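
For what it’s worth, a freeze file that also pins the index sidesteps point 5; a sketch (the timestamp and versions are invented):

-- cabal.project.freeze
index-state: 2023-08-01T00:00:00Z
constraints: any.aeson ==2.1.2.1,
             any.text ==2.0.2
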
2 Likes

The front page of Hackage says that packages may opt out of curation, and provides a link to what curation means, including that trustees may help by revising metadata. The word “backdoor” carries the wrong connotation here, IMHO.

1 Like

The devil is in the details. The tarball is immutable and self-contained (and cryptographically signed). The hash of ghcup-0.1.19.2 hosted at https://hackage.haskell.org/package/ghcup-0.1.19.2/ghcup-0.1.19.2.tar.gz is

b25a15adaaca30a227ed12560d1d89924d9d3ae17fc5798f20f2f00484866088

and that will not ever change.
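
Anyone can check this from the shell (e.g. with sha256sum on Linux; the expected output is the hash above):

❯ curl -sL https://hackage.haskell.org/package/ghcup-0.1.19.2/ghcup-0.1.19.2.tar.gz | sha256sum
b25a15adaaca30a227ed12560d1d89924d9d3ae17fc5798f20f2f00484866088  -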

The “curation” part is a separate “service” built on top of the regular tarballs. Your use of the word “converted” is misleading; the tarballs are still there. This is an interesting thread about the role of package metadata:

It is (kinda) documented and there are a few packages to access it.

What’s ad-hoc in your definition? Is the ghcup metadata format ad-hoc?

Technologically speaking, I agree the index format is a bit… “not fancy”. I invite anyone interested in a redesign to open a thread to discuss it. Personally I’d like to take some ideas from stackage’s pantry. I once tried to define a canonical conversion to git (to subsume the half dozen ad-hoc options you can find on GitHub) but I never managed to finish running it, because there are 171k entries in the index and my Python script could not cope :joy:

Burning out doesn’t seem to be exclusive to trustees; I bet they burn out just like everyone else :see_no_evil:

What breaks? My definition of breaking is when a package stops compiling. Speculative upper bounds prevent that, and relaxing them after manual verification is definitely not a “let it break, then fix it” approach.
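
(That manual verification is typically just a build against the candidate version with the bound relaxed; a sketch, with made-up package names:)

❯ cabal build somepkg --constraint='foo ==1.3.*' --allow-newer='somepkg:foo'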

We furiously agree here. Better tooling and more automation are sorely needed. I can only do a little, but I am doing it.

2 Likes

I think I didn’t get the point across: the curation (which may be sorely needed) depends on the infrastructure used. Instead of correctly uploading a new version, the fix lives only in the metadata, so everyone downloading just the tarball may not get the fixed cabal files.

Uploading a proper new version is what practically every other ecosystem does. It works well.

A package. It may stop compiling. With the current model… that’s what happens, and then hackage trustees come along and fix it: not the code, but the bounds.

I’m puzzled how this is deemed a good state of affairs.

Apologies… I dug it out, and it’s actually the index cache that’s ad-hoc and implementation-defined.

Yes. There’s no schema.

1 Like

Just 17 more revisions and cabal install hoogle works now :wink:

1 Like

Back to the topic, my preference would be to scrap the special semantics of ^>= and make ^>= X.Y a precise equivalent of >= X.Y && < X.(Y+1). Version bounds are already hard, and introducing another layer of difficulty does not actually help anyone.

  • If I put a hard bound foo < X.Y and foo-X.Y turns out to be incompatible, we are all set already. Otherwise, if foo-X.Y turns out to be compatible with my project, I can make a revision.

  • If I did not put an upper bound at all (YOLO) and foo-X.Y remains compatible, we can carry on. If it breaks things, alright, let’s slap on a revision. This can save some work if foo is extra stable.

  • But if I used a soft bound foo ^>= X.(Y-1), I must make a revision in both cases. Either foo-X.Y is compatible, in which case I should write foo ^>= X.(Y-1) || ^>= X.Y; or it’s incompatible, in which case I must put foo >= X.(Y-1) && < X.Y, because now I know that this is not a speculative bound. Guaranteed busywork in both scenarios (see the sketch below)!
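
Concretely, with a made-up dependency foo and a freshly released foo-1.2, the three styles play out like this:

build-depends: foo >= 1.1 && < 1.2
  -- hard bound: a revision is needed only if foo-1.2 turns out compatible

build-depends: foo >= 1.1
  -- no upper bound: a revision is needed only if foo-1.2 breaks things

build-depends: foo ^>= 1.1
  -- soft bound: a revision is needed either way; if foo-1.2 is
  -- compatible, the bound becomes  foo ^>= 1.1 || ^>= 1.2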

See the similar sentiment in Drop the requirement of specifying upper bounds by hasufell · Pull Request #51 · haskell/pvp · GitHub.

4 Likes

I think there should be a way to specify the intended meaning of the caret such that this does not require a revision. There’s a monotonicity property we can use to guide this.

We are slowly moving in this direction: cabal check has recently started to warn on missing upper bounds.

Those things are fine if there’s automation behind them (e.g. CI) and possibly a user interface that lets me click which bounds to update (not edit a 500-line cabal file by hand in an online editor from the 90s).

Given that automation and user interface… not specifying defensive upper bounds might work just as well.

FWIW cabal should warn about this, but it should never become a hard failure during hackage uploads, if only because it’s trivial to bypass, e.g. by setting the missing upper bounds to < 10000.

1 Like