About the cabal exact parser, there’s one standing partial solution right now.
This is, iirc, missing the ability to prune data after the main text? Moreover, I’m not sure about its general reliability; the parser tends to let a lot of invalid Cabal files through (I’m told rejecting invalid Cabal files isn’t an objective). I think the HLS project is looking for someone to complete it, and this might be the fastest way to get an exact-parser.
I have my own, but it’s sort of embarrassing (I can parse whitespace and comments, and everything else as “raw”): liamzee/flatparse-cabal-exact on GitHub, a flatparse-based library for exact-parsing cabal files.
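To make the "whitespace, comments, and everything else as raw" idea concrete, here is a minimal sketch of that kind of lossless tokenizer. It is not the code from flatparse-cabal-exact and it doesn't use flatparse at all; the names `Token`, `tokenize`, and `render` are made up for illustration, and the only Cabal rule it assumes is that a comment is a line whose first non-whitespace characters are `--`.

```haskell
module ExactSketch where

import Data.Char (isSpace)
import Data.List (isPrefixOf)

-- | Hypothetical token type (an assumption, not the library's real one).
data Token
  = Whitespace String  -- spaces, tabs, and newlines, kept verbatim
  | Comment String     -- from "--" up to, but not including, the newline
  | Raw String         -- the rest of a line, not interpreted at all
  deriving (Eq, Show)

-- | Split the input into tokens without dropping a single character.
-- The Bool records whether only whitespace has been seen so far on the
-- current line, because "--" only starts a comment in that position.
tokenize :: String -> [Token]
tokenize = go True
  where
    go _ [] = []
    go lineStart s@(c : _)
      | isSpace c =
          let (ws, rest) = span isSpace s
          in Whitespace ws : go (lineStart || '\n' `elem` ws) rest
      | lineStart && "--" `isPrefixOf` s =
          let (cmt, rest) = break (== '\n') s
          in Comment cmt : go False rest
      | otherwise =
          let (raw, rest) = break (== '\n') s
          in Raw raw : go False rest

-- | Print the tokens back out; for any input, render . tokenize == id.
render :: [Token] -> String
render = concatMap tokenText
  where
    tokenText (Whitespace s) = s
    tokenText (Comment s)    = s
    tokenText (Raw s)        = s
```

For example, `tokenize "name: foo\n  -- note\n"` gives `[Raw "name: foo", Whitespace "\n  ", Comment "-- note", Whitespace "\n"]`, and `render` turns that back into the original string. The round trip is the whole point of an exact-parser; the real work is replacing `Raw` with properly structured fields and sections without losing that property.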
As for the project itself, a very skilled Haskeller could probably push out a prototype in 25-50 hours, then test and refine it in another 25-50 hours; i.e., I wouldn’t consider the parser itself that challenging.
The bigger problem, rather, is the information in the GitHub thread; that is to say, the Cabal file format seems eccentric, and is defined by the parser’s behavior.
The Cabal codebase dates all the way back to 2003, at the very least, and it’s a bit hard to be onboarded onto. Actually building a high-quality parser requires going through a medium-large codebase, understanding how the parser is called, and how the parser works, and the Cabal megarepo itself is eccentric:
It seems that originally, the Cabal executable (cabal-install) didn’t exist as a separate package, and Cabal was usable on its own via Setup.hs. cabal-install is actually hooking into Cabal, just like Stack, and there’s a bit of kludge in doing so; it works, it works well, but the codebase is harder to understand as a consequence.
The fastest way to do it would be to have someone familiar with the Cabal codebase, and in particular the Cabal parser, rig something up in a few weeks.
The second fastest way to do it would be to find a more skilled Haskeller than me to build the exact-parser, possibly based off VeryMilkyJoe’s version; I think it might take a month at most to understand the Cabal parser in its native context, then a couple of days of work, two weeks tops, to build the parser.
However, I’m still going to try to do it as a personal challenge. As I am seeking medical care right now, I have a lot of spare time, and suspect I might be able to get something decent by February at the latest.
If someone beats me to the exact-parser, I’ll still hopefully understand the Cabal codebase by then and be able to help improve documentation for it.