We’ve just updated our production code base to GHC 9.2.8 (lts-20.24) and decided to try out the --nonmoving-gc RTS option. The weird thing is that after we pushed this to the test environment, about 50% of the running services started restarting every few hours.
This normally happens when they run out of memory: the platform notices failing health checks, kills the running service and starts a new one. But when we checked, memory usage was only at about 50% of the allowed amount, and the following error was printed just after the failed health checks:
Control.AutoUpdate.mkAutoUpdate: worker thread exited with exception: thread blocked indefinitely in an MVar operation
(In our code, auto-update is a dependency of the logging code, via monad-logger's dependency on fast-logger, and also of our main workhorse, warp.)
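For context, this is roughly how auto-update gets used, as far as I understand it. It is only a minimal sketch and not our actual code (we only use it indirectly through the libraries): the cached action (getCurrentTime) and the one-second frequency are picked purely for illustration, and it needs the auto-update and time packages. mkAutoUpdate hands back a cheap action that returns a cached value, and a worker thread refreshes that value in the background. That worker is the thread named in the error message.

```haskell
import Control.AutoUpdate
  ( defaultUpdateSettings
  , mkAutoUpdate
  , updateAction
  , updateFreq
  )
import Data.Time.Clock (getCurrentTime)

main :: IO ()
main = do
  -- Build a cached version of getCurrentTime.
  -- The action and frequency below are illustrative assumptions.
  getCachedTime <- mkAutoUpdate defaultUpdateSettings
    { updateAction = getCurrentTime -- the expensive action to cache
    , updateFreq   = 1000000        -- refresh interval in microseconds
    }
  -- Callers read the cached value; the worker thread keeps it fresh.
  now <- getCachedTime
  print now
```

If the worker thread dies, as in the error message, I would expect callers of the cached action to stop getting fresh values, which might explain the failing health checks, but that is a guess.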
Now, this never happens when running the services without the --nonmoving-gc option, so I’m wondering if anyone knows what’s going on?
We have noticed that the CPU usage doubles when using --nonmoving-gc, but I’m not sure if that is related in any way.
Does anyone have ideas, or advice on how to remedy this phenomenon? Is GHC 9.2 just a bit buggy w.r.t. the non-moving GC, and would using GHC 9.4 maybe help?
