So, I want to get the source code of all packages on Hackage at once, and ideally also all the metadata like number of downloads and reverse dependencies. Is there some handy tool that would do this for me? And how much would it weigh in the end — 1GB, 10GB, 100GB?
Running a local instance of Hackage is not my goal at this time, though it would be nice to have.
These mirrors basically only contain source code and package-relevant metadata, not the information for a UI, so they are very close to what you want.
Note that the "package-relevant metadata" (in the form of cabal files and their revisions) also already exists on your machine, in the form of the 01-index.tar which cabal fetches from Hackage.
That said, neither this nor any other mirror tool I am aware of will fetch or clone download counts. Those would need to be scraped separately.
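As a rough illustration of what the 01-index.tar mentioned above gives you, here is a minimal Python sketch that lists every package and version found in it and builds a source tarball URL for each. It assumes the index entries are laid out as <pkg>/<version>/<pkg>.cabal, that a revised cabal file shows up as a repeated entry with the same path, and that tarballs live under https://hackage.haskell.org/package/<pkg>-<version>/<pkg>-<version>.tar.gz. The function names read_index and tarball_url are made up for this example.

```python
import tarfile
from collections import defaultdict
from pathlib import PurePosixPath


def read_index(index_path):
    """Map each package name to {version: number of .cabal entries}.

    Under the layout assumed here, a count above 1 means the .cabal
    file for that version was revised after upload.
    """
    packages = defaultdict(lambda: defaultdict(int))
    # "r:" opens the archive as an uncompressed tar.
    with tarfile.open(index_path, "r:") as tar:
        for member in tar:
            parts = PurePosixPath(member.name).parts
            # Assumed layout: <pkg>/<version>/<pkg>.cabal
            # Other entries (preferred-versions, package.json) are skipped.
            if len(parts) == 3 and parts[2] == parts[0] + ".cabal":
                packages[parts[0]][parts[1]] += 1
    return {name: dict(versions) for name, versions in packages.items()}


def tarball_url(name, version, base="https://hackage.haskell.org"):
    """Build the (assumed) download URL for one package version."""
    return f"{base}/package/{name}-{version}/{name}-{version}.tar.gz"
```

The index usually sits somewhere like ~/.cabal/packages/hackage.haskell.org/01-index.tar or ~/.cache/cabal/packages/hackage.haskell.org/01-index.tar, depending on your cabal version and configuration, so check where yours is before pointing read_index at it. Looping over the result and calling tarball_url for each version gives you a download list, and you can get an idea of the total size by fetching a sample first.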