DevOps Monthly Log, April 2025

Hello, welcome to the next monthly log.

April was dominated by the migration from Equinix to OpenCape. Although I was directly responsible for moving Stackage, I also helped troubleshoot issues with downloads.haskell.org. We have a lot of DNS records. Some of them are outdated legacy records that need to be cleaned up. Others are part of the machinery that enables caching for sites like downloads.haskell.org. Still others are used by maintainers to upload the downloads. Long story short, I figured out why people were having trouble uploading files, and Ben fixed it.

The Stackage migration was my worst migration yet. Stackage.org ended up being unavailable for 11 hours. I misconfigured the new server and it had no filesystem caching. Disk operations took thousands of times longer than normal. I still don’t know what happened next, but somehow this atrocious performance caused the website to become unresponsive. My best guess is that there was an asynchronous IO exception that got “handled” somewhere it shouldn’t have. Rather than crashing, the application just stopped accepting new requests.
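
As a general illustration of that failure class, and not a diagnosis of the Stackage application itself, here is a minimal Haskell sketch of how a catch-all handler can absorb an asynchronous exception. The names `slowRead` and `swallowing` are hypothetical and exist only for this example; the assumption is simply that some slow IO action is wrapped in a handler that matches `SomeException`.

```haskell
import Control.Concurrent (threadDelay)
import Control.Exception (SomeException, catch)
import System.Timeout (timeout)

-- Hypothetical stand-in for a disk operation that has become very slow.
slowRead :: IO String
slowRead = threadDelay 2000000 >> pure "file contents"

-- Hypothetical catch-all wrapper. Because it matches SomeException, it also
-- catches asynchronous exceptions delivered to this thread, including the
-- one that `timeout` throws to interrupt the action.
swallowing :: IO String -> IO String
swallowing action = action `catch` handler
  where
    handler :: SomeException -> IO String
    handler _ = pure "recovered"

main :: IO ()
main = do
  -- Ask for a 0.1 s limit on an operation that takes about 2 s.
  result <- timeout 100000 (swallowing slowRead)
  -- The handler absorbs the timeout's exception, so the caller sees a
  -- normal result instead of Nothing.
  print result
```

The point of the sketch is only that code meant to "recover" from errors can also intercept the signals used to cancel or time out work. Libraries such as safe-exceptions offer handlers that deliberately leave asynchronous exceptions alone.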

I had tried to get a lot done on the day of the migration, and went to bed not long after “finishing” it. I didn’t notice the issue until I woke up the next morning. That’s why it took so long to recover.

I was first able to mitigate the issue by restarting the application, and then by adding CDN caching to reduce load on the server. Finally, after the May 1 holiday, I followed my strongest hunch about the performance problem and immediately found the filesystem misconfiguration.

Luckily, Stack continued to operate just fine during the outage. Stackage curators were able to successfully create new snapshots, too. Now that I’ve fixed the performance problem, Stackage is running better than ever. Best of all, I completed the migration before the April 30 deadline, even though a communication mishap left me in the dark about how firm this deadline was until a week before it arrived. (The old server even got recycled immediately on that day, which was another delightful surprise I discovered after the fact.) April ended with a bang!


The Haskell Foundation’s financial situation continues to be the major roadblock to my work. Luckily, a lot of infrastructure work is already handled by volunteers who have been here a lot longer than I have. Well-Typed also sponsors the work with Ben Gamari’s time. But there’s a lot of work not getting done. Of course, every open-source project is getting crunched right now (OSU OSL’s story is the latest I’ve heard). Please do what you can, wherever you can. Thanks, and see you next time!
