Hello, here’s another weekly log!
Previously
One week ago, I wrote a prototype script that notarizes a GHC bindist. I will write a post about notarization in particular in the next day or two, so stay tuned for more on that topic.
After switching away from notarization, I took care of a few loose ends, and then I went back to characterizing spurious failures. I'm glad to be back on this, because the rate of spurious failures feels like it has gone up recently.
So far this week, I have looked into eight reported failures, which had six distinct causes:
- A Nix bug on aarch64
- A very rare case of ghc panicking
- ghc-pkg dying because a package conf file unexpectedly does not exist
- A rare, unexplained test failure
- A rare, bizarre failure where the whole project directory disappeared while a job was running
- Network timeouts between the runners and the GitLab server
In support of this work, I have created a full-text search database of job logs in SQLite to explore error occurrences. (SQLite rocks.) I’ll try to make it accessible at some point.
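For anyone curious what a full-text search over job logs can look like, here is a minimal sketch in Python. It is not the actual database: the table name `job_logs`, the helpers `build_log_index` and `find_jobs`, and the sample log text are all made up for illustration, and it assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3

# Placeholder data, not real CI output: (job id, log text) pairs.
SAMPLE_LOGS = [
    ("job-1", "build started; compiler panic during build"),
    ("job-2", "build started; network timeout contacting server"),
    ("job-3", "build started; job finished without errors"),
]


def build_log_index(logs):
    """Create an in-memory FTS5 index from (job_id, log_text) pairs."""
    conn = sqlite3.connect(":memory:")
    # job_id is stored but not tokenized; only log_text is searchable.
    conn.execute(
        "CREATE VIRTUAL TABLE job_logs USING fts5(job_id UNINDEXED, log_text)"
    )
    conn.executemany(
        "INSERT INTO job_logs (job_id, log_text) VALUES (?, ?)", logs
    )
    conn.commit()
    return conn


def find_jobs(conn, phrase):
    """Return the ids of jobs whose log contains the given phrase."""
    # Double quotes make FTS5 treat the input as a single phrase query;
    # embedded double quotes are escaped by doubling them.
    query = '"' + phrase.replace('"', '""') + '"'
    rows = conn.execute(
        "SELECT job_id FROM job_logs WHERE job_logs MATCH ? ORDER BY rowid",
        (query,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_log_index(SAMPLE_LOGS)
    for phrase in ("panic", "network timeout"):
        print(phrase, "->", find_jobs(conn, phrase))
```

With an index like this, checking how often a given error message shows up across jobs is a single query, which is handy when trying to decide whether two failures share a cause.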
Coming up
In the next few days, I will be updating my failure tracker based on these findings and continuing to characterize more reported failures.