As promised, I migrated Stackage.org to a new server yesterday. Unfortunately, some time afterward, the server process froze and remained unresponsive. I didn’t learn about the problem until the next morning (obviously a problem in itself), at which point I was able to restart the process.
It froze again 15 minutes later.
Acting on a hunch, I improved the caching (Cloudflare CDN) to reduce server load. The site has been running without issue for three hours now, which hopefully means the acute incident is resolved.
I still don’t know what went wrong. The server process was running, but it returned some kind of error response and wrote nothing to the logs. I will work to understand the root cause and to find ways to mitigate the issue faster should it happen again.
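One way to notice this kind of failure sooner is an external probe that requests a page over HTTP, since a check that only confirms the process is alive would have missed it. The following is a minimal sketch in Python, not what Stackage.org actually runs: the function names, the 60-second interval, the three-failure threshold, and the `print`-based alert are all assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout=10.0):
    """Issue one GET request and return (healthy, detail)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # urlopen raises HTTPError for 4xx/5xx, so reaching here means success.
            return True, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return False, f"HTTP {exc.code}"
    except OSError as exc:
        # Covers URLError, refused connections, and timeouts (a frozen process).
        return False, f"no response: {exc}"


def should_alert(results, threshold=3):
    """True when the last `threshold` probe results were all failures."""
    recent = results[-threshold:]
    return len(recent) == threshold and not any(recent)


def monitor(url, interval=60.0, threshold=3, rounds=None):
    """Probe `url` repeatedly; `rounds=None` means run until interrupted."""
    results = []
    done = 0
    while rounds is None or done < rounds:
        healthy, detail = probe(url)
        # Keep only as much history as the alert rule needs.
        results = (results + [healthy])[-threshold:]
        if should_alert(results, threshold):
            # Placeholder alert: a real setup would page or email someone.
            print(f"ALERT: {url} failing ({detail})")
        done += 1
        time.sleep(interval)
```

Calling `monitor("https://www.stackage.org/")` from a machine other than the server itself would keep probing until interrupted. Requiring several consecutive failures before alerting trades a few minutes of detection delay for fewer false alarms from a single slow response.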
If you’re interested in helping out with this community resource, please bring your ideas to one of these issues: