[Author’s note: my previous monthly logs are all on one thread, Monthly DevOps Logs, and stretch back to August 2024. I’ve decided to go back to making new top-level posts for each log.]
Let’s see what my work journal has to say this month…
Ah yes, I created a diagram for the current Stackage deployment. I haven’t thought of a good place to publish it, but here’s a snapshot:
After that, I did more work preparing for migrating Stackage to its new home. I wrote a migration plan that will take away the guesswork and reduce the risk of forgetting steps — a standard strategy for any operational procedure.
The migration itself hasn’t started. I’m waiting on access to the new server. But in the meantime, I deployed a big refactor to Stackage.org that greatly reduces wasteful disk usage: “Post Hoogle generation disk cleanup” (Pull Request #343, commercialhaskell/stackage-server).
In the Cabal world, I had to do some work to recover from lost data. The release pipeline relied on container images that got “cleaned up”. The risk was that the Cabal 3.14 series would suddenly lose bindists for a particular operating system, CentOS 7. Now, that OS should probably be dropped for new major releases, since it’s out of support. But a major series should try to support the same bindists for its whole lifetime. Luckily, the OS image was rebuilt by Matt Pickering, and I was able to patch Cabal’s pipeline to use it: “Use a newer ci-images for centos7” (Pull Request #10843, haskell/cabal).
In ops land, I did some investigation into how Fastly is used, which allowed me to identify an issue in how Hackage was being cached that was causing errors for cabal users. I also dealt with a dead Windows CI machine.
In GHC CI, I started working on a patch to reduce wasteful disk usage in the ghcup-ci pipeline. This is a component used in GHC CI to test installing bindists with GHCup. In this scheme, the bindists that are created in GHC CI itself are immediately tested with this component. Unfortunately, they are tested in a pipeline that fails 100% of the time. Naturally, this means the ghcup-ci test itself has been failing for quite some time and presumably nobody has noticed and/or prioritized its fix. I am not blameless: I noticed this at least a month ago, but only now while writing this did I open a ticket.
Overall, this felt like a short month because I’ve been busy with other work these days. But I’m glad to see I still got a lot done in the short amount of time I have to spend on things. Which reminds me: the last thing I’ll mention is that I introduced a new donation system to the Haskell Foundation leadership, which has been put into use. You can find it on the HF sponsorship page. I’ve heard through contacts with other open source foundations that it is a really effective tool, even if you could summarize it as “it replaces the PayPal button”. Maintaining infrastructure critical to the ecosystem is a never-ending task. Want to help me have time to do more of it? You, too, can donate!