DevOps Weekly Log, 2023-10-04

Hello, welcome to the weekly log.

It finally happened: I forgot to write a log last week and ended up skipping the week altogether. So, this log covers two weeks.

Releasing GHC 9.6.3 and following up on that work took the bulk of my time. The rest of it has been spent dealing with CI, as befits my role.

I volunteered to make a release in order to get first-hand knowledge of the process. Although a lot of the release process is automated, there is plenty of room for improvement. This is the kind of work my role was created to do. I am still writing up my notes and recommendations. A lot of stuff that would normally go into a weekly log is going into that, instead. As a matter of fact, I am still working on the writeup, or would be, except…

GHC CI is kind of falling over right now. The basic reason is that resourcing for GHC maintenance is a little low. Plus, there is no one monolithic “GHC CI”: since a lot of maintenance work has gone towards producing releases recently, the release CI pipeline has gotten plenty of use and is mostly working just fine. The others, not so much. I have it in my lists to think about how to have fewer overall pipelines so that a fix in one pipeline will benefit the others.

I have spent all of today so far, and a few hours on each of the last few days, looking into emergent CI issues. To list some examples, I have worked around an issue with Darwin runners, kept tabs on build failures, and looked at a couple of Cabal CI issues (1, 2).

These issues highlight the need for more GHC resourcing. CI maintenance was on hold while the releases were being produced. There are fewer GHC resources available overall right now, exacerbating the problem.

The tone of this log might feel a little grim, but I’m actually not worried. The whole software industry is tightening its belt right now, so things are just gonna naturally slow down a little bit. I think now is a great time to build some momentum by gradually removing the impediments that affect Haskell programmers, and that’s what I’m gonna keep doing.

See you next time!


The MR template currently contains the following snippet:

By default a minimal validation pipeline is run on each merge request, the ~full-ci
label can be applied to perform additional validation checks if your MR affects a more
unusual configuration.

I know that there is also the fast-ci label, which should be even more minimal. But only 13 currently open MRs use it, so maybe not everyone even knows about this option. Just mentioning this label in the MR template could save a little bit of resources for very simple patches and make a small dent.

Turns out that’s a bug in the template: some time in the last month or two, “fast-ci” was made the default for merge requests. Now there’s a “full-ci” label that makes it go the other way.


Ah, I see that this was done in !10907 (ci: Make "fast-ci" the default validate configuration). I think the old fast-ci label used even fewer platforms: just 3 instead of the 5 that are now used by default.


Thanks for the update @chreekat!

What is “GHC resourcing” exactly? People? Money for cloud services?

It’s people. Of course there will probably always be more work than available personnel can manage, but it’s particularly acute right now.
