I’d consider it a big flaw in base that we don’t have “approved” HTTP libraries, but getting simple HTTP into base is a political quagmire.
But what’s worse is that the Haskell HTTP libraries aren’t at a sufficient level of maturity for someone to even push to add HTTP into base; for instance, Network.HTTP.Conduit will crap out if I have Linux shell variables pointing my HTTP proxy to socks.
From what I’ve seen of the group working on Cabal, the labor and organization seem to be at least adequate, and good things will come within the span of a year.
But that still leaves us with the library ecosystem at large, and the question is, “Can we do better?”
I’m not sure why that would need to be in base, especially given how you noticed that the maturity is sub-par.
On the other hand, core libraries are a bit of a fuzzy term and there’s not a very clear guideline on what’s supposed to be a core library and how to become one. The CLC readme has some words on that here.
My opinion is that it could be up to a separate project to define a set of battle tested libraries, document maintenance status and so on. That is mostly documentation work. No need to push this into base and burn CLC out.
haskell-tls is questionable: there’s no audit, and we don’t have sufficient research into how vulnerable the Haskell GC/RTS is to side-channel attacks, how memory-secure it is, how it interacts with laziness, and so on.
If you use HsOpenSSL instead, you’ll get into trouble distributing your binary, if you link dynamically. If you link statically, you’ll have a security hazard.
The solution is bindings (HsOpenSSL or botan or whatever), but the shipping/distribution issue will remain: dynamic linking causes portability issues, static linking causes security issues. That’s why shelling out to curl is the most portable and easiest solution, in fact.
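For anyone who hasn’t tried the shelling-out route, here is a minimal sketch of what it could look like using only the `process` and `directory` packages that ship with GHC. The names `curlArgs` and `fetchWithCurl`, the particular curl flags, and the 30-second timeout are my own choices for illustration, not anything recommended upthread.

```haskell
import System.Directory (findExecutable)
import System.Exit (ExitCode (..))
import System.Process (readProcessWithExitCode)

-- | Flags for a plain GET. This selection is an assumption of the sketch.
curlArgs :: String -> [String]
curlArgs url =
  [ "--silent"          -- no progress meter
  , "--show-error"      -- but still print errors to stderr
  , "--fail"            -- exit non-zero on HTTP error statuses
  , "--location"        -- follow redirects
  , "--max-time", "30"  -- hypothetical overall timeout in seconds
  , "--url", url
  ]

-- | Fetch a URL by running the curl executable found on PATH.
-- Returns Left with a message if curl is missing or exits non-zero.
-- The body comes back as a String decoded with the locale encoding,
-- so this is only suitable for text responses.
fetchWithCurl :: String -> IO (Either String String)
fetchWithCurl url = do
  mcurl <- findExecutable "curl"
  case mcurl of
    Nothing -> pure (Left "curl not found on PATH")
    Just curl -> do
      (code, out, err) <- readProcessWithExitCode curl (curlArgs url) ""
      pure $ case code of
        ExitSuccess -> Right out
        ExitFailure n ->
          Left ("curl exited with code " ++ show n ++ ": " ++ err)
```

Calling `fetchWithCurl "https://example.com"` from ghci needs nothing beyond the boot packages, and proxy handling, redirects and TLS are all left to whatever curl the system provides. The cost is the one raised further down: curl might be absent or behave slightly differently per platform, which is why the sketch checks for the executable first.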
The reason I think it’d be nice to have in base would be so that a fast call to ghci would be able to get a hold of network access, instead of playing Russian Roulette with cabal install --lib (it’ll blow up in your face sooner or later) or putting up with delayed build times with cabal repl / cabal scripts.
The String precedent is also very nice; people complain about String all the time and wish that we defaulted to Text, but String implies it’s fine to have suboptimal code in base and to expect people to use other things for serious work.
Go at least has network libs in standard libraries, and given that Haskell is used a lot for backend, it seems reasonable to not expect learners or trial users to get a library to make HTTP requests.
That said, I agree that existing network libs, while mature, are not mature enough, and there are likely to be serious technological challenges because of the need for secure cryptography.
I guess it’s something the Haskell community doesn’t have the resources to do at this point in time.
It depends. I’m trying to push for cabal init to enable -threaded by default, but in theory this is supposed to be GHC’s problem (there’s an open ticket there that’s stalled because the intended contributor had issues working with the testing framework).
I guess the Stack folks have it easy; base + GHC + cabal + GHCup are the core Haskell toolchain and everyone keeps on pushing problems to some other part of the toolchain.
In any case, I agree with most of what you’ve said.
“If you link statically, you’ll have a security hazard.”
Could you elaborate on this? I don’t think the “outdated C library” argument works because [from my extremely limited experience] network libraries are complex state machines, so a pure Haskell implementation could easily be way less secure.
I also wonder if as a Core Library it could be built into a binary and distributed that way, which would allow backwards-compatible security updates through simple minor version bumps on every version in a several-year time window.
Instead of having one point of security updates (e.g. Linux distributors, who monitor CVEs very closely and have had tools and communication processes in place for decades) from which fixes are shipped easily and quickly, it’s now up to every application developer to monitor CVEs and roll out new binaries whenever anything statically linked is busted.
That is significant work. We just started to have Haskell security advisories. We’re not even close to that level of trust from application developers.
Can you also elaborate on the portability issues of dynamic linking e.g. botan? Doesn’t shelling out to curl also cause portability issues (it might not be present or it might be slightly different on each platform)? Is there really such a big difference between the two?
When you link dynamically, you have a couple of options:
- ship bindists for every distro you can imagine (built on said distro)… this is what HLS does to achieve high portability (and we can’t link HLS statically due to certain issues)
- build on the oldest distro you can find (debian oldstable or something): that’s not good for security either, but works well for glibc compatibility
- analyze all the library versions for every distro and build bindists for ABI versions (GHC has a matrix for example)
Either way, it usually increases the complexity of shipping binaries. Depending on how stable the ABI of the library is, old binaries may also stop working (that’s happening with old GHC bindists linked against old ncurses right now).
Since botan was mentioned - there is a full TLS implementation in botan, but it is not present in the FFI and it will require some C++ work before I can provide Haskell bindings to it. I am not sure what the timeline for that will be, but I am currently performing work of a similar nature to enable X509 certificate support.
First, I want to say that even if we had a great http library that everyone agreed was great, I would not want it in base, or even in a core lib shipped with ghc, because experience has taught us that even libraries everyone agrees on are best managed when decoupled, where possible.
Second, I am hopeful that botan will provide a good tls api and make a reliable, easy-to-install, cross-platform HTTP(s) library possible.
But finally, I want to point out that for many circumstances, having a good tls implementation and a good http implementation is not nearly enough to have a good http library for all users. Unfortunately, there’s a lot of other “conveniences” that people expect with http clients – that they follow redirects, that they handle different connection disruptions in a recoverable way, and especially that they can probe a bunch of system configuration to try to determine and make use of system proxy settings.
This is one reason why in my experience, the robustness of an http experience even when linking against libcurl is not nearly as nice as shelling out to curl – the executable does a very good job of dealing with all that stuff (as well as auth schemes, etc) in the way closest to what “normal users expect”.
I’d love for us to have a library that worked close to as well – but I just want to caution that the task is way larger than just the basic protocols would suggest.
(Let me add: for writing programs and servers running on systems I personally control, and where I can configure both code and the system itself, existing libraries are fine, and perhaps preferable to shelling out – but for writing programs that can be distributed widely and be used by many people over many different system configurations, that’s when it turns out to be very painful.)
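To give a feel for how much hides behind “probe system proxy settings”, here is a rough sketch of just the environment-variable part. The variable names (`http_proxy`, `https_proxy`, `all_proxy` and their upper-case forms) follow the common Unix convention, the lookup order is my assumption, and `ProxyKind`, `classifyProxy`, `proxyFor` and `systemProxyFor` are hypothetical names. It ignores `no_proxy`, credentials, and OS-level settings entirely.

```haskell
import Data.Char (toLower, toUpper)
import Data.List (isPrefixOf)
import Data.Maybe (listToMaybe)
import System.Environment (getEnvironment)

data ProxyKind = HttpProxy | SocksProxy | UnknownProxy
  deriving (Eq, Show)

-- | Classify a proxy URL by its scheme. Values without a recognised
-- scheme are reported as UnknownProxy rather than guessed at.
classifyProxy :: String -> ProxyKind
classifyProxy raw
  | hasScheme ["socks://", "socks4://", "socks4a://", "socks5://", "socks5h://"] = SocksProxy
  | hasScheme ["http://", "https://"] = HttpProxy
  | otherwise = UnknownProxy
  where
    s = map toLower raw
    hasScheme = any (`isPrefixOf` s)

-- | Pick the proxy for a request scheme ("http" or "https") from an
-- environment given as key/value pairs. Scheme-specific variables win
-- over all_proxy; empty values are skipped.
proxyFor :: String -> [(String, String)] -> Maybe (String, ProxyKind)
proxyFor scheme env =
  listToMaybe
    [ (v, classifyProxy v)
    | name <- candidates
    , Just v <- [lookup name env]
    , not (null v)
    ]
  where
    lower = map toLower scheme
    candidates =
      [ lower ++ "_proxy"
      , map toUpper lower ++ "_PROXY"
      , "all_proxy"
      , "ALL_PROXY"
      ]

-- | Same lookup against the real process environment.
systemProxyFor :: String -> IO (Maybe (String, ProxyKind))
systemProxyFor scheme = proxyFor scheme <$> getEnvironment
```

A client that only speaks HTTP proxies could at least use `classifyProxy` to fail with a clear message when it sees a `SocksProxy`, instead of falling over as in the socks case from the top of the thread. Everything else in the list (redirects, recoverable disconnects, auth schemes) is additional work on top of this.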