Haskell Foundation DevOps Weekly Log, 2022-07-22

Hello again, welcome to week 10 of my devops log.

This week I wrapped up the script for backfilling CI failures, made some tweaks to the failure tracker, started corralling runners, and looked into a problem with marge-bot.

Details:

  • My backfill script was “done” last Friday, but this week I put some finishing touches on it, including concurrent fetches via monad-par.
    • The script is only in my private notes repo so far, but I’ve also put it in a snippet.
  • My CI failure dashboard now has a table of links, hooray.
  • I met with Ben Gamari to sync up on access to CI resources.
    • There are about a half-dozen different entities sponsoring a rather diverse set of computers.
      • I now have access to most of them. :key:
        • With this power, I will be able to start taking over maintenance of these machines and inspecting their failures.
  • marge-bot failed, requiring a manual restart. I think we can improve on that process, and I intend to do so next week.

So what else is up next?

  • Start automatically retrying jobs that fail spuriously
  • Fix or replace flaky CI runners
  • Open lines of communication with the parties responsible for the (still inaccessible) CI resources
  • Follow up on other spurious failures

That’s all for now. Enjoy the weekend!

P.S. What is marge-bot? It’s… well, you can go read the project README. :wink: In short, it ensures that every merge results in a history of commits that all pass CI.

Good to see this slowly improving :muscle:
