Today, the migration of the Haddock documentation generation program to live in the GHC repository has been finalised, with great help from @bgamari.
Why?
Until now, Haddock was living in two places:
On GitHub, where some tickets were filed and PRs merged, with changes that were ideally not “GHC-specific”
On GitLab, and in particular on the ghc-head branch, which welcomed GHC-specific patches, and was synchronised with the GitHub repo.
The GitLab repository was then used by GHC as a git submodule, whose synchronisation also had to happen manually.
As most of the changes brought to Haddock are in fact GHC-specific, it made little sense to keep Haddock on GitHub, or even on GitLab with manual synchronisation steps.
This caused much confusion, as people would open tickets on GitHub about patches that had already been integrated on GitLab.
I am glad to say that this state of absolute madness is now over.
What do we get out of this?
The centralisation of Haddock, a tool that is heavily dependent on GHC’s internals, is going to shorten the feedback cycle for developers. This will improve the quality of each release, and lower the administrative burden on contributors.
Won’t this prevent new people from contributing?
It is true that we lose the network effects of GitHub, but I am quite happy to say that the GitLab administrators are more responsive than ever when it comes to activating new accounts.
For those who can attend, events like ZuriHac, MuniHac, and local Haskell meetups are also an opportunity for people to create their accounts on GitLab and start contributing.
For those who cannot, sending an email to ghc-devs ought to be enough.
So, welcome Haddock to the GHC repo! We will have great adventures together, and documentation will be a source of pride in our community and culture.
So Haddock is now more coupled to GHC’s internals?
Not really; it was simply moved into GHC’s git repo. The only difference is the workflow: making simultaneous changes to GHC and Haddock no longer requires synchronizing submodules.
As a first-time contributor to the GHC codebase, I found the merge gate requiring GHC to be in lockstep with Haddock very bewildering and annoying. Super excited to rebase my branch and get rid of the submodule update!
Awesome! Although I’d love these things to be decoupled, the coupling was there, and reflecting it in an accurate and non-confusing way in the codebase is great.
Since Haddock is no longer scattered across two repositories, can the entirety of Haddock now be separated out into its own codebase, perhaps via a combination of library and tools? It would be unfortunate if the boundary between GHC and Haddock ended up being as “blurry” as the current boundary between GHC and Cabal:
to the point of requiring its own “big-picture” overview…
My guess is, because most changes are related to new syntactic extensions to the language. Haddock needs to understand how to parse the new syntax, how to attach its comments to the new syntactic constructs, and how to render it all as HTML.
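To make the "attaching comments to syntactic constructs" part concrete, here is a small sketch using standard Haddock comment markers. The `Shapes` module, its type, and its function are hypothetical names made up for illustration; they are not taken from the Haddock or GHC sources.

```haskell
module Shapes (Shape (..), area) where

-- | A two-dimensional shape.
-- A "-- |" comment documents the declaration that follows it.
data Shape
  = Circle Double            -- ^ A circle, given its radius.
  | Rectangle Double Double  -- ^ A rectangle, given its width and height.

-- | Compute the area of a 'Shape'.
-- A "-- ^" comment documents the item that precedes it,
-- here each argument and the result of the type signature.
area
  :: Shape   -- ^ The shape to measure.
  -> Double  -- ^ Its area.
area (Circle r)      = pi * r * r
area (Rectangle w h) = w * h
```

Each comment here hangs off a different kind of construct (a data type, its constructors, a function, the individual arguments in its signature). Whenever the language grows a new kind of declaration, Haddock presumably has to learn where comments may attach to it, which is my guess as to why the changes track GHC so closely.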
One could imagine rewriting Haddock from scratch around a compiler-neutral API that handled all this, and making GHC its client. It might be a cleaner architecture but I’m not sure what its impact in practice would be. And anyway it’s not likely to happen.
So another (attempt at a) compiler- or implementation-neutral API is optional: Haddock ought to be able to use what is provided by the ghc package, just as GHC’s driver program already does now.