DevOps Weekly Log, 2024-04-17

Finally, another weekly log. Sorry for missing a few weeks!

The big update is that the Stackage handover is complete. This means that the Stackage website is now maintained by the Haskell Foundation! If you have issues or suggestions for the website, direct them to the stackage-server repository and I’ll get back to you.

Note that snapshot curation is still run by the Stackage curators. Their work is centered on the main Stackage repository. I provide hardware for them to use, but the curators still run the show. :slight_smile:

Other recently-completed tasks:

CI runners for GHC:

(The situation is not good here, but I have not had time to work on it much.)

  • I removed some work-in-progress jobs from the CI that had ceased to work at all, which removed some unnecessary timeouts.
  • A big problem we are facing is that our Darwin runners are unreliable. I have had some conversations about expanding capacity that I hope will bear fruit.
  • Also for Darwin, I started researching new ways of providing pristine, reliable platforms on which CI jobs can run.

Release CI for Cabal:

  • I finally finished updating CI to include all the Linux platforms (i.e., the container images) now available from GHC’s CI configuration.
  • The same patch also sped up most jobs by using the Haskell toolchain already present on the images instead of ignoring it.

Combating spurious failures in CI:

  • A few people submitted proposals for my Google Summer of Code idea. It would be a project to improve on my failure analysis tooling. I set up interviews with the proposers so that I can make recommendations to the GSoC admins.
  • While attempting to deploy some minor changes to spuriobot, I ran into some issues with our deployment tooling that took time to diagnose. (Diagnosis: we need tests and automation for our deployment tooling.)

Coming up:

Now that I’m responsible for the Stackage website, I have a lot of follow-up work to do.

  • Document the system architecture.
  • Document the deployment implementation.
  • Attract volunteers to help maintain the infrastructure.
  • Write a presentation for the Haskell Ecosystem Workshop.
  • (My todo list has 43 items marked “important”, lol)

CI runners for GHC:

  • I really hope to start solving platform capacity issues. Not just for GHC, and not just for CI. Like—did you know Stackage snapshots only get built on Linux? Wouldn’t it be cool to build them on more platforms? Or—I think we could shrink the radius of the Haskell Upgrade Painwheel with more tooling. But these goals need compute power! So, have you or your employer considered sponsoring the Haskell Foundation?

Sorry this one is a bit longer. Unfortunately I wasn’t able to attend the GHC issue triage yesterday, so I can’t make it even longer. :slight_smile: But that’s really it for now. See ya!