…?
Here’s another disclaimer - we shouldn’t expect some “merry band” of unpaid volunteers to suddenly appear and “save the day”. Perhaps they read comments like this:
and were then told that GHC and Cabal were “intertwined”.
It would mean fewer potentially-“bugged” versions that need to be repaired post-release, and therefore less “garbage out”.
Oh, you may have to recalibrate your expectations. Haskell is not really used in the release process. The tooling of release engineering is its own ecosystem and practices. YAML & Gitlab-CI are the hammer and nail for the job. Build artefact delivery is not a problem solved by pure lazy functional programming, nor should it be approached as such.
Well, I didn’t make claims about what language our systems are currently implemented in! But we do have the world’s most advanced popularly-used language at our fingertips, so surely we can make inroads into this problem, and yes, build artefact delivery really ought to be a problem that is amenable to a solution in such a language.
I know it’s not easy. At Groq we have a team on the order of size 10 working full time (including in Haskell and Nix) to solve such problems. But I can’t believe it can’t be done.
Making major releases is not the hard part, backporting bugfixes from master and making minor releases for each major series in a timely fashion is.
Aha! So maybe you’re talking about number of supported releases rather than releases per se? Now, there may be an implicit assumption that there’s no point having a release if it’s not supported. But this does seem to clarify something, which is that @hasufell and @bodigrim seem to be talking about a different issue.
Yes that is what is being talked about in the ticket:
Have 2 (soon 3) active series and having to double (soon triple) the effort of backporting bug fixes.
I believe there is. The direction, in my opinion, is to simplify and streamline. I’m cautious about claiming that this is indeed true, though, before being able to demonstrate it.
We are exploring some ideas around this in the stable-haskell/ghc repository on GitHub. Whether or not this will be successful, only time will tell. And yes, the idea is to (try to) upstream all changes that prove successful for us.
Well, GHC doesn’t have 10 full time devops engineers.
A GHC release for me alone means:
Can some of this be automated? Sure, but automation itself is code too and requires maintenance as well. Unless someone is going to pay me for this or GHC starts pumping out releases on a weekly basis… there isn’t enough incentive.
Additionally, good QA always has manual steps.
And this is just my side of the story.
The whole discussion is a bit of a chicken and egg problem… yes, if we have better automation and better processes, we could have higher release frequency without imposing much work on the community. However, just increasing release frequency and then hoping the community finds a way to deal with it and automate away their problems is a bit… bold? Maybe it’s better to start slow and increase carefully, while actively prioritizing good release processes.
We have made some improvements to the communication and other parts with GHC HQ: release-management.mkd · main · Glasgow Haskell Compiler / ghc-hq · GitLab
It’s good to be mindful about the social implication of releases.
In case I’m at risk of being misunderstood I think I should clarify a few things.
Firstly, the people who are doing the work know the most and are doing the work. They get the final say, of course, and I’ll be grateful for whatever comes out of it.
What I’m doing in this thread is sharing my concern about slowing down manual work, because my experience is that when manual work is done less frequently it can make things worse because the people doing it are less familiar with it. The specific dynamics are the important thing though: my concern may not apply in this case. The people who are already involved know better than me, and ultimately I’ll defer to them, and trust them to make a good decision.
That said, I want to clarify that my point about having ~10 dev ops engineers was not that “therefore we should have 10 devops engineers working on Haskell”. Quite the opposite! It was to point out that I know it involves a lot of work, so my suggestion may be infeasible.
That said, Groq has 17m lines of code that I can see, and more millions that I can’t. I suspect GHC+Cabal+HLS+GHCUp is less than 2m, so an order of magnitude smaller. So, by codebase size at least, if Groq has ~10 then Haskell should be able to get by with ~1. The HF had one full time devops engineer (now part time) and he’s been doing a lot of great work. If we had the resources we could staff that function more and hopefully make more inroads into improving release quality and decreasing workload on volunteers.
I hope what I said didn’t give the impression that I am suggesting that. If it did, then please say exactly what I said that gave that impression, so I can communicate more clearly in future.
Well, fine: first decrease release cadence in line with community resources, then improve stability, automate workflows, find more volunteers, implement any other measures you see fit, then consider increasing release cadence again. While I appreciate calling out opportunities to improve, I’m afraid at this point they serve mostly as a distraction from acting on the issue.
Cadence must fit community. A younger (= less legacy code) or larger (= more hands and eyes) community could indeed handle a faster cadence. But we are not that community.
I agree generally with @michaelpj and @TeofilC that having fewer major releases will mean there are more serious bugs in major versions which are harder to debug since they will not be battle-tested for a long time.
It does appear that there is also a frustration about time taken to perform minor releases. This seems to stem in @arybczak’s case from the desire for specific bug fixes which haven’t been released yet. That is certainly annoying. What possible avenues would there be to improve that?
Solution 1: Only very few companies are willing to contribute money to GHC development. So this does not seem feasible.
Solution 2a: This could be the most practical solution, but it would probably mean that releases like 9.6 would no longer be maintained and users would be forced to upgrade more quickly to a new release. It would definitely be easier for maintainers if there were fewer active branches.
Solution 2b: This goal does not seem very quantifiable nor achievable without additional maintainer time. It is also quite hypothetical; my impression is that much time is taken up by backporting and deciding whether patches should be backported.
In the past the pendulum swung the other way, for example, see this post from 2017 - Announcing the GHC DevOps Group - perhaps it is time to slow things down again. I don’t think there is a perfect solution in either direction.
On another note, I do not agree with the suggestion that the release cadence should be determined by the amount of volunteer work a few “super maintainers” would like to perform. It seems to me a failure in management of the ecosystem if only a few individuals have responsibility and power to adapt to new changes in the language. Any volunteer has the complete control to choose the level they participate, and no-one is asking any volunteer to participate at a level they are uncomfortable with. I have myself at times felt compelled or responsible as a volunteer, but now I realise that I didn’t have to participate if I didn’t want to, and other people would take up the tasks I left behind.
In general I feel like these decisions should be primarily driven by GHC developers, as I believe the ability of a project’s developers to self-determine and have influence over their workload is very important for the health and desirability of a project. Taking into account what people in the community think is of course very important to decisions such as this.
From Announcing the GHC DevOps Group:
Quality calendar-based releases every six months
As Ben Gamari documented in detail in his Reflections on GHC’s release schedule, actual GHC release dates are hard to predict and initial release quality is often low due to critical bugs.
So apparently the purpose of introducing CI back in 2017 was to establish some regular cadence for releases;
But critical bugs are still being found post-release with the current (six month) cadence.
Therefore this experiment with a six-month cadence has failed: at the moment, all CI seems to be doing is causing continuous irritation.
[…] users forced to upgrade more quickly to a new release.
…an annual release schedule would mean there are half as many releases - that ought to give users more time to upgrade. Moreover, Ben’s tabulation of pre-CI release (time) intervals reveals an approximate informal cadence of sum [12, 9.5, 6, 7, 19, 13, 14, 14] / 8 = 11.8125, or (rounding up) twelve months - only once in that sample did a release occur in six months.
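For anyone who wants to check that figure, here is a minimal sketch of the same calculation. The interval list is copied from the expression above; the names intervals and mean are just illustrative.

```haskell
-- Release intervals in months, copied from the list quoted above.
intervals :: [Double]
intervals = [12, 9.5, 6, 7, 19, 13, 14, 14]

-- Arithmetic mean of a non-empty list.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

main :: IO ()
main = do
  print (mean intervals)                    -- 11.8125
  print (length (filter (<= 6) intervals))  -- 1: a single interval of six months or less
```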
So unless we all want to have this conversation yet again in six months’ time: right now, switching to a twelve-month release cadence is the least-worst option.
This is fair. Of course no one here is trying to make decisions for GHC developers.
But I feel this kind of misses a point. If the release cadence is not aligned with community resources, then the result is:
This means that GHC releases are less well tested and may have more bugs simmering before they are detected. I think that effectively reduces release quality.
So for feedback loop reasons I think it might be in GHC’s own interest. We don’t want a disengaged community that’s not excited about new GHC releases anymore, simply because the cadence is overwhelming.
Speaking for myself: fast cadence doesn’t make me less excited; buggy releases, incomplete release notes, and slow fixes / long-lasting broken windows make me less excited.
I don’t see how changing release cadence can directly improve your experience then.
Perhaps it would be more worthwhile to designate an LTS release, the maintenance and bug fixes of which have higher priority than even major releases? It doesn’t matter which release is the LTS as long as there is one. A rough guideline could be to produce a minor release within a month of knowing the fix for a certain critical bug, even if there are still fixes for other critical bugs in the pipeline.
Edit: I think that is also in line with what Julian proposed in GHC 9.12.2 is now available - #5 by hasufell.
I had a thought or two on this and wanted to propose those thoughts.
Ideally we want to have at most two LTS’s at once. 6 months seems far too short a time for a major release to settle down, so let’s aim for yearly releases, with two 3 year LTS’s. We would like each release to have some additional time at its tail end, so that while the new release is easing in, the one that is being removed can still be used with the latest fixes. Additionally, a feature/change/non-fix freeze 2 months before a release means that we can tighten the nuts and bolts, and it gives plenty of time to get the release ready for all interested parties.
Below is my poorly rendered excel interpretation of this.
This gives us our primary, LTS, “A” and “B” channels, with a “C” channel to use every other release for a shorter term release. I chose a release month of September since that’s the start of the academic year, it’s after a summer of Haskelling, and (mostly) because it’s Haskell Curry’s birth and death month. Conveniently, this avoids the winter holidays as a hot spot for development, requiring only that the old release is EOL’d at some point during the winter. We could also have an alpha release (or a GHC X.Y.0 release or something) at the end of July as a sneak preview of sorts.
If and when we have more resources, we could have nightlies be the lead up to whatever the next release is, and then just make a cut at the end of June.
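As a rough sketch of how the channels could interleave (not the original spreadsheet): this assumes releases are numbered by year, that even-numbered releases are the 3 year LTS’s (“A” and “B”) and odd-numbered ones are the “C” channel, and that a “C” release is supported for one year. The one-year figure and all the names here are my assumptions for illustration.

```haskell
-- Sketch only: release n ships in September of year n.
data Channel = LTS | ShortTerm deriving (Eq, Show)

-- Every other release is the shorter-term "C" channel (as proposed above);
-- picking the even-numbered releases as the LTS ones is an arbitrary choice.
channel :: Int -> Channel
channel n
  | even n    = LTS
  | otherwise = ShortTerm

-- Support length in years: 3 for an LTS is from the proposal,
-- 1 for a short-term release is an assumption.
supportYears :: Int -> Int
supportYears n = case channel n of
  LTS       -> 3
  ShortTerm -> 1

-- Is release n still supported during the year starting in September of `year`?
supportedIn :: Int -> Int -> Bool
supportedIn year n = n <= year && year < n + supportYears n

-- LTS releases still supported in a given year.
activeLTS :: Int -> [Int]
activeLTS year = [n | n <- [0 .. year], channel n == LTS, supportedIn year n]

main :: IO ()
main = mapM_ (\y -> putStrLn (show y ++ ": " ++ show (activeLTS y))) [0 .. 8]
```

Under these assumptions there are never more than two LTS’s active in any year, and each outgoing LTS overlaps the incoming one for its final year.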
I’d really like to expand and debate these details. Getting everyone on the same or similar versions means we can hammer out more issues with those versions, while also encouraging more adoption of new features. My work only recently updated to 9.6.6, which is 3/4 major versions behind, as an example of ecosystem fracture. Stackage’s LTS is currently 9.8.4, which is 2 major versions behind.
To me this idea, often dubbed “tick tock releases” has always seemed plausible:
Advantages:
Ben wrote an HF proposal around this in May 2022, but my perception is that no one was particularly enthusiastic; I’m not sure why.
Perhaps it is time to revisit it. It would be good to hear from other users about whether this would be good, or bad, or don’t care.
Having a scan of the proposal, I can see the similarities. Personally I think 6-monthly full “releases” is possibly too quick, but having an on/off on LTS’s should keep the number of supported GHC versions down.
Another advantage of my proposed schedule is that we could have these LTS’s, but only start recommending them on ghcup and similar a year/6 months after their release, meaning that while much of the ecosystem can proactively update, we don’t recommend a release that isn’t widely supported, as well as allowing bugs to be found by those same proactive users.
We discussed a few ideas about LTS releases at ZuriHac last weekend, let me defer it to @jmct to share meeting minutes.
From my perspective the problem is that if some releases are designated as “tock” / “non-LTS”, there is not much incentive for the community to support them. For instance, I am likely to skip or postpone such work in my packages. It remains to be seen to what extent others will do the same, because after a certain threshold such GHC releases will be barely usable for anything practical, so one might ask why bother doing them at all. This risk is for the GHC team to consider.
There is an unwritten expectation that maintainers upgrade their packages promptly once a new GHC is released. If that expectation is relaxed (e.g., we agree that maintainers are required to track only LTS or tick releases, or we agree that it’s fine for maintainers to do such work only once a year), I’d have no concerns about what the GHC team sees fit for release cadence.