Paragon Haskell Large Codebases?

Just curious, but what are the best publicly-available large code bases in Haskell?

I.e., something that’s useful for people to learn from as examples of well-written, well-commented, and well-designed codebases of substantial size?


I’m curious how you would use them.

Large codebases tend to be messy, at least a bit! Large also means “developed through many years”, so you can see idioms shifting, different styles, different testing guidelines etc.

I contribute to cabal now and then; it is not super-big, but it is big enough to learn things by exploring it.


heh I would like to test graphex on a giant public codebase instead of just my work one :smirk:

But for graphex - the messier, the better!


I’m having difficulty going through a large Haskell codebase right now. I’m aiming to make sure everything is commented, but I’m a bit stalled: it has to support legacy features and has certain restrictions on available libraries, which I think is why it’s so baroque.

I want to find an easier, albeit still large, codebase on GitHub to get some practice reading large Haskell codebases.

Gosh, there are so many! :slight_smile:

I’ll just list some of my favourites and regular go-to’s:

  1. flora-server: I’m not sure there is a better example of a servant-based website out there. I think the code is quite well-structured, and there are lots of interesting things to discover as you dive deeper into the implementation in various places.
  2. stack: whether or not you use it as a package manager, it was my first love as an extremely friendly ecosystem with very well designed code, for my money.
  3. hasura graphql-server (for a limited time only!): Hasura are moving away from Haskell, but still I think this codebase is interesting as a reference because it’s probably the largest Haskell codebase if you count by end-users? From that point of view I think, even though it’s being phased out, it’s very interesting to learn from; and I’m sure I’ve used it in the past as a reference for different things.
  4. hasktorch: I think this is interesting for just how sophisticated it is; I think they do a lot of complicated/interesting things with types, and (if I’m remembering correctly) have a very interesting way of integrating with the torch ecosystem in a way that some of the other Haskell ML projects did not.
  5. pandoc: A classic, and probably doesn’t require much introduction. I find pandoc interesting for how complicated it could be in theory, and yet how easily you can work with it to do real transformations. I think it has a very nice data model.

This of course excludes pure libraries; but really, I think one of Haskell’s big strengths is how much can be learned by reading the source code of the libraries, and, in some sense, how small a lot of them manage to be; i.e. how well they compose together to allow us to build useful things with them :slight_smile: That is to say, while I’ve learned a lot from glancing at the projects above, I’ve learned at least as much from just looking at the codebases of the larger libraries and dev-tools themselves! :slight_smile:


I second that. And I’d take its codebase over cabal’s any day (purely talking about code hygiene), although there are some rough spots too.

You can also skim through IOG’s large blockchain codebases, which are all public, e.g. the node or the wallet. Kadena is public too. Whatever you may think of blockchain, those are usually well-designed codebases.


Does the Diagrams library count? I think it is a very neat library that you can do a lot with.


I recently contributed to hackage-server, and found it to be a really nice experience! It’s very well organized and well architected.


Thank you for the link! I think I am going to try this today right on cabal’s codebase!


Speaking of Hasura, are there any Haskellers proficient at Rust seeing obvious issues with moving graphql to a Rust idiom?

@Ambrose brought up in the Hasura thread that there is no guarantee of Graphql-Engine being successfully ported to Rust.

Via sloccount, the codebase is 200k lines of Haskell, and the estimated cost at 60k developer salaries is 8 million. Considering that a lot of Haskell development work is done by senior engineers, and even allowing for lower costs from hiring Indian Haskellers, you’re still looking at about 16 million USD worth of developer hours.

Given that Rust is between 150% and 300% the code length of Haskell, that’s easily 48 million USD worth for the rewrite, and possibly 187.5 developer years.
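For anyone who wants to check the arithmetic, here is a rough sketch that takes the figures above as given. The names are purely illustrative, and it assumes cost scales linearly with code length, which is a big simplification.

```haskell
-- Back-of-envelope rewrite cost, using only the figures quoted in this post.
-- All names are illustrative; nothing here comes from sloccount itself.

-- Adjusted estimate for the existing Haskell codebase, in USD.
haskellCostUsd :: Double
haskellCostUsd = 16e6

-- Assumed Rust-to-Haskell code-length ratios (150% and 300%).
lengthRatioLow, lengthRatioHigh :: Double
lengthRatioLow  = 1.5
lengthRatioHigh = 3.0

-- Assumption: rewrite cost grows linearly with code length.
rewriteCostUsd :: Double -> Double
rewriteCostUsd ratio = haskellCostUsd * ratio

-- Annual cost per developer implied by a total cost and a developer-year count.
impliedAnnualCostUsd :: Double -> Double -> Double
impliedAnnualCostUsd totalUsd devYears = totalUsd / devYears

main :: IO ()
main = do
  putStrLn ("Low estimate (USD):  " ++ show (rewriteCostUsd lengthRatioLow))
  putStrLn ("High estimate (USD): " ++ show (rewriteCostUsd lengthRatioHigh))
  putStrLn ("Implied annual cost per developer (USD): "
            ++ show (impliedAnnualCostUsd (rewriteCostUsd lengthRatioHigh) 187.5))
```

Under these assumptions the range runs from 24 million to 48 million USD, so 48 million is the upper end. Spreading 48 million over 187.5 developer years works out to 256k per developer per year.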

Of course, this doesn’t say that rewriting GraphQL is impossible, and perhaps people more familiar with both Rust and Haskell might be able to clarify Hasura’s apparent project.


If this is in fact an expensively intractable problem, I’m guessing Hasura execs might end up kicking themselves for not trying, instead, a port to Linear Haskell, or paying Well-Typed or Tweag to help mature Haskell-Rust FFI and clone Sigma / SC’s Mu, except using Rust instead of C++.

If using the most current languages is a key Hasura value, wrapping Rust with Haskell would likely have given them the most avant-garde backend possible.