Haskell Foundation DevOps Weekly Update, 2023-02-01

Hello, here’s another weekly log!

Previously

One week ago, I wrote a prototype script that notarizes a GHC bindist. I will write a post about notarization in particular in the next day or two, so stay tuned for more on that topic.

After switching away from notarization, I took care of a few loose ends, and then I went back to characterizing spurious failures. This makes me happy, because the rate of spurious failures feels like it has gone up recently.

So far this week, I have looked into eight reported failures, which had six distinct causes:

  • A Nix bug on aarch64
  • A very rare case of ghc panicking
  • ghc-pkg dying because a package conf file unexpectedly does not exist
  • A rare, unexplained test failure
  • Some kind of rare, bizarre failure where the whole project directory disappeared while a job was running
  • Network timeout issues between the runners and the GitLab server

In support of this work, I have created a full-text search database of job logs in SQLite so I can explore error occurrences. (SQLite rocks.) I’ll try to make it accessible at some point.
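
For a flavor of what a full-text search database of job logs can look like, here is a minimal sketch in Python using SQLite's FTS5 extension. It is not the actual database or schema: the table name `job_logs`, the column names, the job IDs, and the sample log lines are all made up for illustration, and it assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3


def build_log_index(logs):
    """Build an in-memory FTS5 index from (job_id, log_text) pairs.

    Table and column names here are hypothetical, chosen for the example.
    """
    conn = sqlite3.connect(":memory:")
    # job_id is stored but not tokenized; log_text is full-text indexed.
    conn.execute(
        "CREATE VIRTUAL TABLE job_logs USING fts5(job_id UNINDEXED, log_text)"
    )
    conn.executemany(
        "INSERT INTO job_logs (job_id, log_text) VALUES (?, ?)", logs
    )
    conn.commit()
    return conn


def search_logs(conn, phrase):
    """Return the IDs of jobs whose log contains the given phrase."""
    # Wrap the phrase in double quotes so FTS5 matches it as a phrase
    # instead of parsing it as query syntax.
    quoted = '"' + phrase.replace('"', '""') + '"'
    rows = conn.execute(
        "SELECT job_id FROM job_logs WHERE job_logs MATCH ? ORDER BY job_id",
        (quoted,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    # Made-up sample data, not real job logs.
    sample = [
        ("job-1", "fatal: unable to reach server: connection timed out"),
        ("job-2", "all tests passed"),
        ("job-3", "error: Connection timed out while fetching artifacts"),
    ]
    index = build_log_index(sample)
    print(search_logs(index, "connection timed out"))
```

With an index like this, checking how often a suspected spurious error shows up becomes a single query, which is the kind of exploration described above.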

Coming up

In the next few days, I will be updating my failure tracker based on these findings and continuing to characterize more reported failures.
