New Hackage Server Features

After a pandemic-adjacent interregnum, hackage server is now again being maintained a bit more regularly with occasional feature updates and plans for the future.

In the current server release (which had to coincide with a migration to a new machine) there are a number of smaller cosmetic improvements, as well as a few more noticeable features, largely stemming from deployment of the work in last summer’s GSoC. Here are a few things I wanted to highlight:

Candidates
The candidates feature still isn’t as polished as we’d like, but it is certainly in better shape: main package pages now list candidates, and users’ individual pages also include links to their package candidates. The candidates index page had been broken for some time and is now fixed. Candidates are now automatically deleted once they are published, and the docbuilder also runs on candidates! (However, a nice UI for looking at candidate build reports is still not implemented.) Furthermore, the workflow for maintaining and releasing candidates is somewhat streamlined.

Builds and Badges
The docbuilder now also runs tests and coverage when possible, and a package’s Hackage page will display badges with the status of tests as well as coverage and doc availability. Furthermore, when you delete docs and reset the failure count, they will now rebuild automatically, i.e. one need not request a manual reset of the docbuilder’s previously locally-maintained failure state.

Markdown
For rendering README.md and changelog files we now use John MacFarlane’s wonderful commonmark-extensions package, allowing us to offer a fairly full set of so-called “GitHub-flavored markdown” including (sanitized) inline HTML, inline images, tables, GitHub-style reference links, and some other niceties. Since many people author their md files first and foremost for display by GitHub, this should be a nice improvement to existing display as well as open possibilities for package authors in the future.

Misc
For end-users (rather than bots) we also now have a better mechanism to attempt to always redirect http to https connections, which I know some people had been asking for for some time.

Final Note
The hackage server codebase is scary and there’s a plan on how to modernize it, although many details aren’t pinned down and there’s certainly no immediate timetable for when we’ll start work on it (or when it may finish). But even as is, there’s plenty of small low-hanging fruit to work on, which improves the user experience of all the many thousands of daily hackage users.

Interested people can take a look at the “good first issues” tag, or attempt to roll up their sleeves on plenty else as well. I’m around in the usual irc channels to lend a hand, and would be happy to help you get set up. Further, if you’d like to help in a more serious way, these issues are a place to just warm up a bit and get familiar.

Happy hacking, all!


Is the published source the same as what is deployed?

I tried multiple times to hunt down that “install with cabal install” suggestion that pops up with every hackage link, but there’s nothing like that on github-published code.

Perhaps the page template was patched to include analytics tokens or something like that, but this raises the question of «what else is there that we don’t get to see».


The published github of the central-server branch (not master) is precisely the same as deployed. This branch has certain tweaks to validation and page templates to make the server more suitable for use as the core instance.

I think the language you’re looking for is buried in a meta tag on a page template (so search doesn’t find it because there’s explicit nbsp characters instead of spaces). There’s a ticket to change it, which includes a link to the location. If you want to try to fix it up, please do! https://github.com/haskell/hackage-server/issues/912


Ugh, it hides well from the GitHub indexer. I couldn’t find it with “Install via” and “og:description” queries.
Sorry for casting doubt.

Everything hides well from the GitHub indexer!


Or, that can be an effect of master being rather stale:

This branch is 145 commits ahead of master.

Master is not stale: it’s the active development branch. Those 145 commits are all the “tweaks” of the sort I mentioned, things which make sense for the main hackage server instance but not individual private ones. Prior to every hackage server deploy we merge master into central-server, then deploy the latter. I’m not sure if there’s a better workflow, or if we should just periodically rebase those tweaks or what (not really a git guru) but that’s the process…


Hm… So, to patch out that cabal install (to use a concrete example) one would need to patch the “tweaks” branch?

It looks like some hackathon is in order to cherrypick all the possible tweaks into master and put in extension points for the remaining.

I think what @sclv said makes sense if you consider the fact that there is no “canonical” hackage server; you’re free to run one as you please. But for the one that sits at https://hackage.haskell.org, they may want their own set of “tweaks” to delineate that special application. That seems fine to me. Remember, Hackage Server is not a singular notion!


But I don’t have “my” hackage server. I want to improve the one I use…

This is very cool, thank you to everyone involved in the release.

I don’t understand however why the docbuilder runs tests only … sometimes? E.g. in my rp-tree package, v 0.5 ran tests whereas 0.6 didn’t

Good question. Looks like we provide build logs for docs, but not logs for tests (though they are uploaded – it would be a nice PR to expose them). Looking on the docbuilder box, the test suite for rp-tree 0.6 failed because it could not execute hspec-discover. I suspect this is because the test suite requires cabal v2 builds and for various reasons due to some missing features in cabal, we can’t yet use v2-build for the docbuilder…

Can I ask the professionals in the field a Hackage API question?

a) Repology:

In the past I helped Repology.org, and I want to help them get information from Hackage in a proper way that is efficient for both parties.

Currently Repology.org downloads a chain of several tarballs of package versions with textual information, and so on, and then parses them with custom Python code. When Hackage changes, adds new files to the chain, or changes formats, the custom Python code gives “results may vary”.

What is the proper way to get from Hackage a full list of packages with their current recommended releases (those Cabal would advertise), i.e. a list of the latest recommended version of each package?

What would be the best way for Repology.org to get the latest recommended versions of all packages through an API or other scalable means? The creator/maintainer of Repology has a custom system that processes repositories: repology/repology-updater on GitHub (“Repology backend service to update repository and package data”).

b) Packdeps:

This same response would also be helpful for https://packdeps.haskellers.com/. The resource is really useful for package maintainers to monitor updates of the dependencies their packages use, so the service helps to notify about updates and to propagate support for them faster through Hackage.

It allows subscribing to RSS feeds with updates like:
https://packdeps.haskellers.com/feed?needle=hnix

But you can see there that the deprecated binary 0.10.0 is advertised (moreover, binary 0.9 is deprecated as well). So update notifications for the real 0.8 branch are replaced by a deprecated version that is “just hanging in there”. I also deprecated the last release of hnix, and that deprecated release would still be advertised to people downstream by the service.

1.5 years ago I opened a report there: “Please, respect Preferred versions in the web service - that allows automation” (issue #50 on snoyberg/packdeps on GitHub).

The local CLI packdeps tool (because it can execute ~/.local/bin/cabal locally) can account for recommended versions, but the much more useful web service does not, or does not know how to. I imagine they do not want to run cabal for API requests, or something of that nature.

So: is it possible to receive recommended versions directly from Hackage, and if so, how?

c) Setups:

I imagine that ability would also allow having robust link URLs like https://hackage.haskell.org/package/hnix/hnix-latest.tar.gz, but this last one is just an idea to evaluate. If I were doing a POSIX installation, I could provide the latest version through a GitHub CDN tarball URL, but the use becomes apparent if someone uses something distributed, like darcs, for development.

But that ability to download a tarball is also incompatible with package revisions.

Still, having the latest recommended version in a badge would be good.


P.S.

Also, I have always been interested in having some info on how the Hackage/CDN system is doing.

Hackage is fast, thank you.


The manifests of all packages are at https://hackage.haskell.org/01-index.tar. It is in chronological order, so the latest (which I suppose you would call “recommended”) are the most recent. From skimming the current code it looks like it’s doing basically the right thing. It’s not clear to me what concrete issues it may be encountering. If it concerns “preferred versions” and deprecated versions: there is also a file that calculates valid version ranges from that information at https://hackage.haskell.org/packages/preferred-versions, or individual info can be parsed from the per-package “Preferred versions” page on Hackage.

If you are checking fairly consistently, you can get a large list of recent uploads from the “Recent additions” page.
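To make the index walk concrete, here is a minimal Python sketch (not Repology's actual importer). It assumes entries in the index are named `<pkg>/<version>/<pkg>.cabal` and that versions are dot-separated integers; `latest_versions` and `parse_version` are hypothetical helper names. It ignores deprecation and preferred-version information entirely.

```python
import io
import tarfile


def parse_version(text):
    """Turn '0.10.0.0' into (0, 10, 0, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def latest_versions(index_file):
    """Map each package name to the highest version seen in an index tarball.

    Assumption: relevant entries look like <pkg>/<version>/<pkg>.cabal.
    Anything else (e.g. per-package metadata files) is skipped.
    """
    latest = {}
    with tarfile.open(fileobj=index_file, mode="r:") as tar:
        for member in tar:
            parts = member.name.split("/")
            if len(parts) != 3 or parts[2] != parts[0] + ".cabal":
                continue
            pkg, version = parts[0], parts[1]
            try:
                parsed = parse_version(version)
            except ValueError:
                continue  # not a plain numeric version, ignore it
            # Compare numerically rather than trusting entry order: the same
            # path can appear again later, e.g. for a metadata revision.
            if pkg not in latest or parsed > parse_version(latest[pkg]):
                latest[pkg] = version
    return latest
```

Calling `latest_versions(open("01-index.tar", "rb"))` on a local copy would give one version per package. Comparing versions numerically instead of taking the last entry is a choice made here, since an older release line can receive an upload after a newer one.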

I would lean against having urls like hnix-latest.tar.gz since that would be unstable and fragile – we always want to think about packages in conjunction with their versions.

Well, thank you.

Yes - it is what Repology is doing currently. The initial part of it.

It is a bit strange that, to get the latest recommended versions from the server, one is asked to download 01-index.tar, which is 728 MB of text files and basically a full history of everything ever published. It needs to be fully downloaded to be hashed and checksummed. The file name was changed a couple of years ago; the -index.tar was still there but somehow incomplete, which was a pretty implicit change. I saw how the main maintainers of Nixpkgs were figuring it out, and then I helped Repology fix its importer accordingly. Then a parser needs to be developed to parse that 728 MB file tree. Then the result has to be matched with https://hackage.haskell.org/packages/preferred-versions to get the ranges, and then one takes the max over the ranges to find the “recommended version”. That is a rather implicit and excessive way to get a recommended version, which is why I asked.

Because that process is complex, Nixpkgs has implemented it through cabal2nix.
Repology, because of that same complexity and because it works with a lot of repositories, has not implemented it, so it advertises deprecated versions (see the haskell:binary packaging badges on Repology).
By the way, Repology, as a massive consumer of repository APIs, is probably a nice place to ask about API design and features for a repository.

I agree that package-latest.tar.gz is unstable. For example, a version can be published, then turn out to have module name clashes and so be deprecated, and in the meantime it has already been downloaded somewhere as the latest. Also, as I pointed out, manifests currently do not include revisions, which is why I refuted the latest URLs myself.

Thank you for your work. I know that you have a lot. It is not that I request this.

I understand there is a complex blurring between Hackage & Cabal functionality.

The 01-index.tar can be incrementally downloaded because it is append-only and provides incremental Merkle hashes internally; this is the design of hackage-security. I’ll grant there are no libraries in other languages that do this, but that’s the reason behind its design.

As a more general point – the reason there’s no api for “latest recommended package versions” as such is that this isn’t a particularly natural thing to ask – there’s really no “recommended” versions, just many versions, which the solver may pick any of, and some which are deprecated, and a rare few which are “preferred”, a very strange concept that was introduced mainly as a solver hack, and only used in specialized corner cases.

Since “whole repository” listings are expensive, the hackage api provides relatively few of them, and only those that are necessary as entry points or for its functioning.

For any individual package, there’s a much more straightforwardly consumable listing of preferred/deprecated info at e.g. https://hackage.haskell.org/package/bytestring/preferred.json
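As a rough illustration of consuming that per-package listing, here is a small Python sketch. It assumes the JSON is an object with `normal-version` and `deprecated-version` keys, each holding a list of version strings; those key names are an assumption here and should be checked against a real response. `newest_non_deprecated` is a hypothetical helper name.

```python
import json


def parse_version(text):
    """Turn '0.8.9.1' into (0, 8, 9, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def newest_non_deprecated(preferred_json_text):
    """Pick the highest version that is not marked deprecated.

    Assumption: the document has "normal-version" and "deprecated-version"
    lists of version strings. Returns None if nothing usable is listed.
    """
    info = json.loads(preferred_json_text)
    deprecated = set(info.get("deprecated-version", []))
    candidates = [v for v in info.get("normal-version", []) if v not in deprecated]
    return max(candidates, key=parse_version, default=None)
```

A consumer such as a badge or feed generator could fetch the URL for one package, pass the body to `newest_non_deprecated`, and advertise the result instead of the numerically highest upload.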

Yes, whether that tar data can be incrementally synced between servers is what I was also wondering. But to sync something incrementally, support for something like rsync/zsync is needed. Or maybe tar can do it by itself (that is, at least, what it was designed for in the first place: magnetic tapes).

Thank you, it was pleasant having a dialog with you.

One can incrementally fetch the tar with http and range requests directly – that’s what the codebase inside cabal does, if you look through it!
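For anyone who wants to try this outside of cabal, here is a minimal Python sketch of the idea. It is not the cabal or hackage-security implementation and it does no signature or hash verification. It assumes the local copy ends with the standard tar end-of-archive marker (two 512-byte zero blocks), so the request starts 1024 bytes before the end of the local file; `range_header` and `apply_response` are hypothetical helper names.

```python
TAR_TRAILER_BYTES = 1024  # assumption: two 512-byte zero blocks end the archive


def range_header(local_size, trailer=TAR_TRAILER_BYTES):
    """Return (start, headers) for fetching everything after the old entries.

    The old trailer is re-fetched because new entries are written where the
    previous end-of-archive marker used to be.
    """
    start = max(local_size - trailer, 0)
    return start, {"Range": f"bytes={start}-"}


def apply_response(local_bytes, start, status, body):
    """Combine the local copy with the server's response.

    206 means the server honoured the range: keep the local prefix and
    append the body. 200 means it sent the whole file: replace everything.
    """
    if status == 206:
        return local_bytes[:start] + body
    if status == 200:
        return body
    raise ValueError(f"unexpected HTTP status {status}")
```

The headers from `range_header` can be passed to any HTTP client (for example `urllib.request.Request(url, headers=...)`), and the resulting status and body handed to `apply_response`. Keeping the network call out of the helpers makes the byte arithmetic easy to check on its own.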