A new future for cryptography in Haskell

Right, it’s ISC which is equivalent to BSD. Sorry for the confusion, I must’ve looked at the wrong thing.

Indeed, I’m not sure why the author did this. The Botan package (at least on Arch Linux) ships a .pc file, so using it as an external dependency should be easy with pkgconfig-depends.

After diving into this yesterday, this is my reaction to many things regarding this library. I did not accomplish much of anything aside from gathering information, but here is what I have discovered so far:

  • Some things like JSON / Text / Vector aren’t re-exports like I initially thought - they are all custom types (yikes!)

  • Most uses of Text look like CB.fromText . T.toText, or exist to derive their custom Z.Data.Text.Print class, so I don’t think they pose a large problem per se, and I think it could be swapped out for Data.Text.

  • All uses of JSON are just used for deriving instances of it, so they can probably be ignored entirely.

  • All uses of Vector are for Z.Data.Vector.Bytes which is actually a thin wrapper around PrimArray Word8 - we could convert this to use ByteArrayAccess / ByteArray constraints possibly, but we’re going to have to deal with it. Ditto for another type CBytes, which is a direct newtype around PrimArray Word8.

  • The Z.Data.CEBytes type is intended for constant-time equality checks, but it is also a newtype over PrimArray Word8 - this might be replaceable with a newtype ConstEq ba = ConstEq ba with instance (ByteArrayAccess ba) => Eq (ConstEq ba) where ConstEq a == ConstEq b = constEq a b, or something similar. We’re going to have to deal with it the same way as Z.Data.Vector.Bytes.

  • Z.IO.BIO stuff is unnecessary / experimental and can probably be jettisoned / ignored entirely.

  • Some stuff in Z.Foreign gets a lot of use eg withPrimVectorUnsafe so we’ll have to be careful bringing in or replacing it.
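To make the constant-time equality idea above a bit more concrete, here is a rough sketch of the shape I have in mind. It uses a strict ByteString from the bytestring package rather than a ByteArrayAccess constraint, purely so that it only needs GHC boot packages; with the memory package, Data.ByteArray.constEq would replace the hand-written fold. The names ConstEq and constEqBytes are made up for illustration, and GHC does not strictly guarantee constant-time execution, so treat this as a starting point and not a vetted implementation.

```haskell
import Data.Bits (xor, (.|.))
import qualified Data.ByteString as BS
import Data.List (foldl')
import Data.Word (Word8)

-- Hypothetical wrapper: equality on the wrapped bytes does not stop at the
-- first differing byte.
newtype ConstEq = ConstEq BS.ByteString

-- XOR every pair of bytes and OR the results together. The fold always walks
-- the whole input, so the work does not depend on where the first mismatch is.
-- The length check does short-circuit, so lengths are not kept secret.
constEqBytes :: BS.ByteString -> BS.ByteString -> Bool
constEqBytes a b =
  BS.length a == BS.length b
    && foldl' (.|.) (0 :: Word8) (BS.zipWith xor a b) == 0

instance Eq ConstEq where
  ConstEq a == ConstEq b = constEqBytes a b
```

Anything that needs the safer comparison gets wrapped in ConstEq, and everything else keeps the ordinary Eq instance of the underlying bytes.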

There are also some questions about how far we want to go, beyond our initial focus on stripping out the Z-* dependencies. We can conceptually look at this as two libraries - there are the low-level botan bindings in the Z.Botan.FFI file, and the higher-level crypto interfaces in Z.Crypto.*. We should think about what we want the crypto interfaces to look like.

  • A prime example of this is how this library uses large enumerations like the HashType and CipherType data types to represent all algorithms as one type. However, this is entirely the choice of the Z library author, and is not required by botan itself. It is also awkward and limits type-level programming compared to functional dependencies with a witness / proxy a la cryptonite, or associated type / data families that I’ve written in the past.
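To illustrate the difference, here is a small sketch contrasting the two styles. None of this is taken from Z-Botan or cryptonite: HashEnum, HashAlgorithm, Digest, algoName and digestSize are hypothetical names, and only two algorithms are shown. The name strings follow Botan's algorithm naming, and the sizes are the standard digest lengths in bytes.

```haskell
{-# LANGUAGE TypeFamilies #-}

import Data.Proxy (Proxy (..))
import Data.Word (Word8)

-- Style 1: one big enumeration. Every hash is a value of the same type, so
-- the type system cannot tell a SHA-256 digest apart from a SHA-512 digest.
data HashEnum = EnumSHA256 | EnumSHA512
  deriving (Eq, Show)

enumName :: HashEnum -> String
enumName EnumSHA256 = "SHA-256"
enumName EnumSHA512 = "SHA-512"

-- Style 2: one (empty) type per algorithm, plus a class with an associated
-- data family, so each algorithm gets its own distinct digest type.
data SHA256
data SHA512

class HashAlgorithm a where
  data Digest a
  algoName :: Proxy a -> String
  digestSize :: Proxy a -> Int -- size in bytes

instance HashAlgorithm SHA256 where
  newtype Digest SHA256 = SHA256Digest [Word8]
  algoName _ = "SHA-256"
  digestSize _ = 32

instance HashAlgorithm SHA512 where
  newtype Digest SHA512 = SHA512Digest [Word8]
  algoName _ = "SHA-512"
  digestSize _ = 64
```

With the second style, a function such as verify :: Digest SHA256 -> ... cannot be handed a SHA-512 digest by accident, which is the kind of type-level guarantee the enumeration makes awkward.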

I’m sure there will be more, but these are my major thoughts after getting it to build, and rifling through everything a bit yesterday and this morning.

I’ll keep this thread updated as needed.
Edited to differentiate CBytes and CEBytes

Yeah, that’s a good question. Considering that crypton seems to have been well-received and a few packages already switched to it, a good option would be to have a low-level package for botan bindings and then use this package to switch crypton to use botan for everything it does over time.

This approach has a significant advantage over coming up with a new library: tons of existing dependencies would benefit directly without having to switch to a new API, and improvements could be incremental, i.e. make crypton use AES from botan, then use hash functions from botan, etc.

botan supports a ton of cryptographic primitives, so presumably crypton could use it for everything it does. This needs double checking though.

And since botan also implements the TLS protocol, presumably the tls package could also be made to use it in the end, which I think would put the Haskell ecosystem in a comfortable place cryptography-wise.

That’s obviously a significant undertaking, but perhaps the Haskell Foundation could help here @david-christiansen?

Perhaps! It would be based on a cost-benefit analysis, of course. This is the kind of thing that our proposals process is really built for - to help explore the actual costs and benefits of this kind of thing, and enumerate just what kind of help is needed.

An update on some brief hackery: On closer inspection, I realized I never had actually gotten it to build properly, but I’m not sure it really matters because I discovered many more issues while fiddling around with the Z-Botan repo knocking out pieces to get it to work.

  1. It uses the older Botan 2, not Botan 3.

  2. Z-Botan is itself woefully out of date, even for its own Z- dependencies

    • It has strict primitive (>=0.7 && <0.8) and time (>=1.11 && <1.12) dependencies
    • It needs Z-Data 8.6.1 and Z-IO 8.1.1, which are themselves outdated
    • It needs ghc865 to get versions aligned.

  3. Botan setup / configuration is done via the configure.py script

    • This script is 3400+ lines of python.
    • Z-Botan calls this during the custom Setup.hs phase (and is a tad inscrutable because there are no comments).
    • args = configureFile:"--amalgamation":"--disable-shared":hostFlag is where we can inject the arguments to control botan’s build
    • Building The Library — Botan
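As a sketch of what injecting arguments at that point could look like, the list from Setup.hs can be factored into a small pure function. The function name botanConfigureArgs and the extraFlags parameter are my own invention, not something in Z-Botan. The example flags in the usage note (--minimized-build, --enable-modules=...) are options I believe configure.py accepts, but check them against the Botan build docs before relying on them.

```haskell
-- Build the argument list handed to the Python interpreter. The first three
-- entries mirror the existing Setup.hs line; extraFlags is a hypothetical
-- hook for anything else we want to pass to configure.py.
botanConfigureArgs
  :: FilePath -- path to configure.py
  -> [String] -- extra flags to inject (hypothetical hook)
  -> [String] -- host-specific flags (hostFlag in Setup.hs)
  -> [String]
botanConfigureArgs configureFile extraFlags hostFlag =
  configureFile : "--amalgamation" : "--disable-shared" : (extraFlags ++ hostFlag)
```

For example, botanConfigureArgs "configure.py" ["--minimized-build", "--enable-modules=sha2_32"] hostFlag would request a build containing only the listed modules, which might be handy while only the hashing bindings exist.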

These issues are severe enough that, between the need to strip out the Z-IO and Z-Data plus the need to update to Botan 3, it might be better to start over with new bindings in a new project rather than try to cajole Z-Botan back to life.

My next move is probably to see if I can get Botan 3 building solo, then try and integrate it into a fresh project, and then pick a subset of the bindings (eg, hashing) and try to implement them using Z-Botan as a reference. This should tell us a lot about the viability of this trajectory, without having to update dozens of modules with hundreds of bindings all at the same time. If it goes well, we can continue pulling bindings over from Z-Botan until we are done.

I am of course open to suggestions, but will otherwise continue on this course as time permits.

3 posts were split to a new topic: Botan bindings devlog