Hi there!
This December, I realized that state-of-the-art LLMs were quite capable of coding in Haskell, and even solving Advent of Code puzzles.
This was a surprise to me, as I had been using Claude Code quite intensely since July and hadn’t noticed the models’ “raw” capabilities, given I was using them on a complex (and messy) codebase.
That codebase wasn’t in Haskell btw, but in Elm, so close enough “in FP spirit”.
Anyhow, since then I’ve been wondering how I could evaluate different models myself, and not rely on advertised benchmarks, which could be more or less optimized for marketing purposes.
I’ve summarized in a blog post my strategy for doing so, and I feel pretty good about it for now.
It has a bit of a “homegrown” flavor to it, but it’s quite simple and adaptable too, so I thought I’d share it in case it’s of interest to others.
I’d be glad to hear your thoughts and experiments on the subject. Also, I’m sure I could learn a thing or two by sharing my post. Here it is:
By the way, another thing I realized in December was that LLMs handle different languages/paradigms, even niche ones, just fine. In other words, the choice of a programming language seems to be less critical than I assumed. That was also a surprise to me, as I was convinced that:
- popular languages had to be “better”
- typed languages had to be “better”
By “better”, I mean better supported, and therefore producing better results.
Both of my assumptions appear to be contradicted by such experiments. As for myself, I’m still interested in wielding the strictest/most powerful language. So if the LLM can help with that (think assisting with teammate onboarding), then I’d tend to think that LLMs have the potential to make Haskell more attractive than it’s been historically.
And yeah, I know about “avoid success at all costs”, but still.
Side note: I may update this post by testing other languages, just to satisfy my own curiosity. Also, I’d be interested to see if all models can complete all days, but I’d prefer to try that later, once I’ve solved them all myself.