Monthly DevOps Logs

Hi all,

I will keep all my monthly updates under this one heading for now, as an experiment. As a reminder, I’ve switched to a longer interval as mentioned in The LAST DevOps weekly log, 2024-06-27.

July 2024

My first month on the new schedule started with a bang, as an OpenSSH vulnerability demanded upgrades of every Linux system everywhere. This was accompanied by an unrelated expiration of a bunch of access tokens related to GHC CI. The latter gave me an opportunity to document how to create new ones! I also set up a calendar reminder so the next round of expiries won’t cause system failures.
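For anyone who wants something more automated than a calendar reminder, here is a minimal sketch of an expiry check. It is not what I actually run; the token names, dates, and the 14-day warning window are all made up for illustration.

```python
from datetime import date, timedelta


def expiring_soon(tokens, today, warn_days=14):
    """Return names of tokens expiring on or before today + warn_days, soonest first.

    Tokens that have already expired are included, since they need attention too.
    """
    cutoff = today + timedelta(days=warn_days)
    due = [(expiry, name) for name, expiry in tokens.items() if expiry <= cutoff]
    return [name for _expiry, name in sorted(due)]


# Hypothetical token names and expiry dates, for illustration only.
tokens = {
    "ci-upload-token": date(2025, 7, 10),
    "mirror-sync-token": date(2025, 12, 1),
}

print(expiring_soon(tokens, today=date(2025, 7, 1)))
```

Run from a scheduled job, a check like this could send a notification whenever the returned list is non-empty.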

Besides those adventures, I also spent time mentoring the GSoC contributor working on spuriobot. That will continue to be my focus in August.

I looked at spuriobot, but there is no README explaining the point of it. Could you add one or link to the GSoC proposal?

Yikes, yeah, thanks for reminding me. :slight_smile:

The GSoC project is based on my project idea found at Summer of Haskell - ideas.

The fact that spuriobot is currently found in a subdirectory of a repo that started out as a private workspace for my own work on spurious failures isn’t a good situation, lol. I intend to get it moved.

How is your “Estimated at 175 hours” going?

August 2024

In August, I continued mentoring the GSoC project, which can now be found on GitHub as well as GitLab. I also began working on setting up infrastructure to host the Haskell certification program (see José’s announcement, Haskell Certification Program). Finally, I took over responsibility for the two existing Windows runners used by GHC CI. I used my new console access to perform the hallowed “turn it off and turn it on again” ritual, to great effect.

In non-HF news, I have spent some time contributing to Snowdrift.coop, a project I previously worked on around 2014–2018. I also started working on a Rubik’s cube scrambler :upside_down_face:

Coming up, I’ll continue setting up the certification website, and with any extra time I’ll be looking into making CI runners more stable (particularly the Darwin runners, which need constant manual intervention right now).

September 2024

Amazingly, I spent September doing what I said I would: setting up the new certification website and contributing to GHC CI.

The certification website isn’t live yet, but is running on a temporary URL while I sort out backups and email-sending functionality.

The Darwin runners have been surprisingly well-behaved, and I haven’t touched them. But I did write a call to action for FreeBSD CI, and there is a work-in-progress patch that I encourage others to contribute to.

October should be much the same: finishing up the certification website and contributing to GHC CI. I will also be traveling to attend the GSoC Mentor Summit (Sunnyvale, California) and MuniHac.

October 2024

Hello, welcome to the next monthly log!

The certification website is still mostly done. I was able to gather some necessary information for setting up transactional emails. I also reprovisioned the server with ZFS so it could get added to the existing Haskell Infrastructure system backup scheme. In less technical terms, I laid the groundwork for one of the surprisingly-most-difficult features of any site, and enabled one of the surprisingly-most-overlooked reliability features as well.

Outside of purely technical work, I went to the GSoC summit, and it was a great experience! I discovered it’s effectively an open source unconference where every attendee happens to be a maintainer of some project. I also got to see (the outside of) a room that is officially labeled an “SRE Panic Room”, which I think is amusing and endearing.

Another attendee and I ran a session on CI, which ended up being a success. Too many CI engineers toil in darkness, down in the s**t mines, unaware that others are facing the same issues basically everywhere. There was some good information-sharing and emotional release. :smiley:

Shortly afterward, on the same whirlwind trip, I went to MuniHac. While there I successfully figured out what’s going on with random’s API, although I still find it hard to use. (I have a draft of a blog post about that. But I don’t have a blog, so…) Of course I also got to see a bunch of friends and have a good time talking about Haskell!

In less cheerful news, nobody besides me has contributed any code to the FreeBSD CI experiment. That means there is still no FreeBSD CI for GHC. Consequently, there are no bugfixes for the platform. The GHC team is a tightly-constrained community resource, and the lack of wider interest in FreeBSD means they should not put their own time into it while other priorities exist. I personally only know of 2 people who have expressed any concern for FreeBSD. (Surely there are more out there?) One of them is GHCup’s maintainer. GHCup considers missing FreeBSD support a blocker, and won’t update its ‘recommended’ version beyond GHC 9.4 until certain bugfixes are applied. So, if you like modern Haskell and you use FreeBSD, please consider contributing in some form or another!

November plans

In the short term, I don’t have any other development projects lined up after the certification service, so I plan to return to Stackage and follow up on some outstanding reliability concerns I have.

Plan for the role

Back in June, I wrote that I would continue in this one-day-a-week role at least through September. Now that it’s nearly November, I think it’s time for an update on that plan. :wink: Personally, I continue to be busy with intensive Finnish courses; I’ve done two in two months, and November and December will be filled with the final two. Adjusting to that routine took time. Coupled with the work I’ve done to clear a backlog of not-Haskell projects, I’ve been slow to look for and find other work. I will be doing more of that in November.

Overall I am less optimistic about finding contract work to augment this DevOps role. Therefore the work I am doing to improve reliability is geared towards setting up a successor and/or an all-volunteer infrastructure team for success. A systems thinker knows to “hope for the best and plan for the worst.” The market continues to be tough. There is less money for non-profits across the board. I’ve heard that GNOME, KDE, Python, and other well-established foundations are struggling with reduced sponsorships and increased costs. I do, however, hope to continue doing this work well into the future! There are still a number of avenues to pursue. Time will tell.

Apologies if I am overinterpreting, but this makes it sound like the bugs won’t get fixed unless someone contributes. But as far as I understand, the only issue with FreeBSD is the one fixed in MR 13276. And that MR is tagged for backporting to 9.6, along with 30 other issues (see the ‘all’ tab). Given that there are so many issues in 9.6.6, it seems like there is a decent chance that 9.6.7 will get released? Especially if we consider that 9.6 is prioritized over 9.8 by the GHC team.

This is a result of FreeBSD being a Tier 2 platform:

So how can FreeBSD be promoted?

Hence the FreeBSD CI experiment:

No, there’s another, bigger issue.

To maintain release quality, release artifacts are only generated through CI. No manual process is used.[^1]

As long as there are no FreeBSD CI resources, there will be no FreeBSD releases.

The platform policy says, “[For Tier 2] we rely on community support for developing, testing, and building distributions.” In the past, third parties would build distribution packages. That doesn’t happen anymore. I can only assume it was a high-friction process. It makes more sense to require CI, even though that makes the platform policy a bit out of date.

[^1]: Although a worrying amount of the rest of the release process is still manual.
