A Classy Update
Decisions are like dominoes: knock a few over and the rest come tumbling down.
After some feedback, and a lot of playtesting, it has become clear that the algorithm ADT trees are terribly unwieldy, and not at all the sort of interface that I’d envisioned when setting out on this project. In response, I’ve come to a decision:
ADTs were better than raw strings or constant patterns, but now they are getting in the way. Expressions like AEAD $ GCM (BlockCipher128 AES_256) 16 and Cryptohash $ SHA3 $ SHA3_512 are awfully frustrating to read and use. I’m (eventually) axing the algorithm ADTs in favor of a better approach.
I was initially following z-botan's lead, which was helpful at first; however, we are not beholden to that format. Additionally, with the need to add support for BOTAN_HAS_ conditional defines for individual algorithms, the ADT approach makes less and less sense.
Instead, I am proposing a classier interface that uses data families to ensure type isolation and inference. Originally, I was planning on working on this interface as a separate cryptography library (originally called crypto-schemes, but that sounds too nefarious), and then making botan conform to it in a separate cryptography-botan library. However, at this point it seems more sensible to skip the extra step of a separate library and implement the conformances in botan itself, while developing cryptography inside of botan to be extracted as a separate library later.
As a result, this update is focused heavily on these new typeclasses:
- Botan.BlockCipher.Class
- Botan.Cipher.Class
- Botan.Hash.Class
- Botan.MAC.Class
- Botan.OneTimeAuth.Class
The new classes are something like:
data family SecretKey alg
data family Ciphertext alg

class BlockCipher bc where
    blockCipherEncrypt :: SecretKey bc -> ByteString -> Maybe (Ciphertext bc)
    blockCipherDecrypt :: SecretKey bc -> Ciphertext bc -> Maybe ByteString

data family Nonce alg

class Cipher c where
    cipherEncrypt :: SecretKey c -> Nonce c -> ByteString -> Ciphertext c
    cipherDecrypt :: SecretKey c -> Nonce c -> Ciphertext c -> Maybe ByteString

data family Digest alg

class Hash h where
    hash :: ByteString -> Digest h

data family Auth alg

class MAC m where
    auth :: SecretKey m -> ByteString -> Auth m

data family OneTimeAuth alg

class OTA ota where
    oneTimeAuth :: SecretKey ota -> Nonce ota -> ByteString -> OneTimeAuth ota
This isn’t exactly how they are (still being) implemented, but it’s an accurate enough representation. Other algorithms and modules that need multiple data families are slightly more complicated to write, but are coming soon, pending some more data family work. I have tried to create proof-of-class implementations of at least one algorithm per class, to show that the approach functions as intended:
- Botan.BlockCipher.AES
- Botan.Cipher.ChaCha20Poly1305
- Botan.Hash.SHA3
- Botan.MAC.CMAC
- Botan.OneTimeAuth.Poly1305
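To make the pattern concrete, here is a minimal, self-contained sketch of the Digest data family and Hash class with two toy instances. Sum8 and Xor8 are hypothetical, non-cryptographic stand-ins invented for illustration; they are not part of botan, and the real instances presumably call into the Botan bindings instead.

```haskell
{-# LANGUAGE OverloadedStrings, TypeApplications, TypeFamilies #-}
module Main where

import Data.Bits (xor)
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import Data.Word (Word8)

data family Digest alg

class Hash h where
    hash :: ByteString -> Digest h

-- Hypothetical toy algorithms (NOT real hashes, illustration only).
data Sum8 -- sum of all bytes, modulo 256
data Xor8 -- XOR of all bytes

-- Each algorithm gets its own digest type via a data family instance.
newtype instance Digest Sum8 = Sum8Digest Word8 deriving (Eq, Show)
newtype instance Digest Xor8 = Xor8Digest Word8 deriving (Eq, Show)

instance Hash Sum8 where
    hash = Sum8Digest . B.foldl' (+) 0

instance Hash Xor8 where
    hash = Xor8Digest . B.foldl' xor 0

main :: IO ()
main = do
    -- Select the algorithm with a type application...
    print (hash @Sum8 "abc")          -- Sum8Digest 38
    -- ...or with a type annotation on the result.
    print (hash "abc" :: Digest Xor8) -- Xor8Digest 96
    -- Type isolation: the following is rejected by the type checker,
    -- because Digest Sum8 and Digest Xor8 are different types.
    -- print (hash @Sum8 "abc" == hash @Xor8 "abc")
```

Because every instance introduces a fresh digest type, digests from different algorithms cannot be compared or mixed up by accident.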
A gold-star example of a relatively finished algorithm module (and the effectiveness of the approach) would be Botan.Hash.SHA3, which we can explore:
import Botan.Hash.SHA3
It has per-algorithm-level functions:
sha3_512 "Fee fi fo fum!"
-- 03a240a2...
It also has algorithm-family-level functions that can use TypeApplications to select specific variants:
sha3 @512 "Fee fi fo fum!"
-- This produces the same digest as before
Explicit typing also works:
sha3 "Fee fi fo fum!" :: SHA3Digest 512 -- Or SHA3_512Digest
These functions are implemented via a more generic, classy Hash interface which uses the Digest data family to ensure that different algorithms and variants have different types while still being inferred properly.
import Botan.Hash.Class
:i Hash
-- class Hash h where
-- hash :: ByteString -> Digest h
:i Digest
-- data family Digest h
We can allow our hash algorithm to be parametric using hash, while still using type applications or inference to select our specific algorithm:
-- Once more at the class-level
hash @(SHA3 512) "Fee fi fo fum!"
-- Once more with explicit typing
hash "Fee fi fo fum!" :: Digest (SHA3 512)
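The three levels (class, algorithm family, single algorithm) can be layered roughly as follows. This is a sketch under assumptions, not the code in Botan.Hash.SHA3: Toy is a hypothetical family indexed by a type-level digest length in bytes, and its "hash" just repeats and truncates the input.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, OverloadedStrings, ScopedTypeVariables,
             TypeApplications, TypeFamilies #-}
module Main where

import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownNat, Nat, natVal)

data family Digest alg

class Hash h where
    hash :: ByteString -> Digest h

-- Hypothetical family, indexed by digest length in bytes (NOT a real hash).
data Toy (n :: Nat)

newtype instance Digest (Toy n) = ToyDigest ByteString
    deriving (Eq, Show)

instance KnownNat n => Hash (Toy n) where
    -- Repeat the zero-terminated input and keep the first n bytes.
    hash bs = ToyDigest (B.take len (B.concat (replicate len (B.snoc bs 0))))
      where
        len = fromIntegral (natVal (Proxy :: Proxy n))

-- Family-level function: the variant is chosen with a type application.
toy :: forall n. KnownNat n => ByteString -> Digest (Toy n)
toy = hash

-- Algorithm-level synonym and function for one fixed variant.
type Toy4 = Toy 4

toy4 :: ByteString -> Digest Toy4
toy4 = hash

main :: IO ()
main = do
    -- All three spellings pick the same instance and agree on the result.
    print (toy4 "abc" == toy @4 "abc")                   -- True
    print (toy4 "abc" == (hash "abc" :: Digest (Toy 4))) -- True
    print (toy @8 "abc")
```

The family-level and algorithm-level functions are only specializations of the class method, so they add no code of their own beyond a type signature.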
The other classes work for at least one algorithm, but at the moment it might require a bit of unsafeCoerce to turn bytestrings into keys, while I get better support for that sort of thing underway.
Here’s CMAC AES128:
import Botan.MAC.Class
import Botan.MAC.CMAC
import Botan.BlockCipher.AES
import Botan.RNG
import Unsafe.Coerce
k <- getRandomBytes 16
mac @(CMAC AES128) (unsafeCoerce k) "Fee fi fo fum!"
-- 7989fb40105646e975311785efae3048
And here’s the ChaCha20Poly1305 cipher:
import Botan.RNG
import Botan.Cipher.Class
import Botan.Cipher.ChaCha20Poly1305
import Unsafe.Coerce
k <- getRandomBytes 32
n <- getRandomBytes 12
ct = cipherEncrypt @ChaCha20Poly1305 (unsafeCoerce k) (unsafeCoerce n) "Fee fi fo fum!"
-- 2b0c0e4e332b4214d3c939b0d1af90a89167d914df538f6cdc364371dd8d
pt = cipherDecrypt @ChaCha20Poly1305 (unsafeCoerce k) (unsafeCoerce n) ct
-- Just "Fee fi fo fum!"
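For readers curious what an instance of the Cipher class might look like, here is a self-contained sketch. ToyXor is a hypothetical, insecure stand-in (XOR against a repeating nonce-and-key pad, plus an unkeyed one-byte checksum in place of a real authentication tag); it is not how the ChaCha20Poly1305 instance is written. Keys and nonces are built directly with their newtype constructors here, which is an assumption about representation rather than botan's API.

```haskell
{-# LANGUAGE OverloadedStrings, TypeApplications, TypeFamilies #-}
module Main where

import Data.Bits (xor)
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import Data.Word (Word8)

data family SecretKey alg
data family Nonce alg
data family Ciphertext alg

class Cipher c where
    cipherEncrypt :: SecretKey c -> Nonce c -> ByteString -> Ciphertext c
    cipherDecrypt :: SecretKey c -> Nonce c -> Ciphertext c -> Maybe ByteString

-- Hypothetical toy algorithm. NOT secure, illustration only.
data ToyXor

newtype instance SecretKey ToyXor = ToyXorKey ByteString
newtype instance Nonce ToyXor = ToyXorNonce ByteString
newtype instance Ciphertext ToyXor = ToyXorCiphertext ByteString
    deriving (Eq, Show)

-- XOR the message against the pad, cycling the pad as needed.
xorPad :: ByteString -> ByteString -> ByteString
xorPad pad msg
    | B.null pad = msg
    | otherwise = B.pack (zipWith xor (cycle (B.unpack pad)) (B.unpack msg))

-- Unkeyed checksum standing in for a real authentication tag.
checksum :: ByteString -> Word8
checksum = B.foldl' xor 0

instance Cipher ToyXor where
    cipherEncrypt (ToyXorKey k) (ToyXorNonce n) pt =
        ToyXorCiphertext (B.snoc (xorPad (n <> k) pt) (checksum pt))
    cipherDecrypt (ToyXorKey k) (ToyXorNonce n) (ToyXorCiphertext ct) = do
        (body, tag) <- B.unsnoc ct
        let pt = xorPad (n <> k) body
        if checksum pt == tag then Just pt else Nothing

main :: IO ()
main = do
    let key = ToyXorKey "0123456789abcdef"
        nonce = ToyXorNonce "nonce"
        ct = cipherEncrypt @ToyXor key nonce "Fee fi fo fum!"
    print (cipherDecrypt key nonce ct)                 -- Just "Fee fi fo fum!"
    print (cipherDecrypt key (ToyXorNonce "other") ct) -- Nothing
```

Since SecretKey ToyXor and Nonce ToyXor are distinct types, swapping the key and nonce arguments, or passing another algorithm's key, fails at compile time rather than at runtime.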
Other classes and data families will be quite similar. Notably, we avoid passing around an explicit algorithm witness / proxy, but remain type-injective due to the data families, so only one call site is required for inference to work. It is also clear that this approach will be very amenable to TemplateHaskell in the future. And don’t forget: eventually, these classes will be pulled out into a backend-agnostic cryptography library.
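The point about proxies and injectivity can be illustrated with a small comparison. In this sketch (all names hypothetical, with a byte-length "hash" as a placeholder), the first style uses a type synonym family, which is not injective and so needs a proxy to name the algorithm; the second uses a data family, where the result type alone determines the algorithm.

```haskell
{-# LANGUAGE OverloadedStrings, TypeFamilies #-}
module Main where

import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import Data.Proxy (Proxy (..))

-- Hypothetical placeholder algorithm: the "digest" is the input length.
data Len

-- Style 1: type synonym family plus an explicit proxy argument.
-- DigestT is not injective, so without the proxy the type variable h
-- would be ambiguous in the method signature.
type family DigestT alg
type instance DigestT Len = Int

class HashP h where
    hashP :: Proxy h -> ByteString -> DigestT h

instance HashP Len where
    hashP _ = B.length

-- Style 2: data family. Digest is injective, so knowing the result
-- type Digest Len is enough for GHC to pick the instance.
data family Digest alg
newtype instance Digest Len = LenDigest Int deriving (Eq, Show)

class Hash h where
    hash :: ByteString -> Digest h

instance Hash Len where
    hash = LenDigest . B.length

main :: IO ()
main = do
    -- The proxy has to be threaded through every call.
    print (hashP (Proxy :: Proxy Len) "abc") -- 3
    -- No proxy and no annotation: comparing against a LenDigest fixes
    -- the result type, and the algorithm is inferred from it.
    print (hash "abc" == LenDigest 3)        -- True
```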
I’m still working on some support classes for data families in Botan.Types.Class, such as Encodable, SecretKeyGen, and NonceGen. These have not yet been applied to the aforementioned cryptography classes, but they will provide the necessary support to make writing data family instances much easier. If you’ve used saltine or cryptonite, you’ll recognize their influence.
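As a rough idea of how a support class could replace unsafeCoerce for key construction, here is a sketch. The method names and signatures of Encodable below are my assumption (the class definition is not shown here), and Toy128 is a hypothetical algorithm with a 16-byte key.

```haskell
{-# LANGUAGE FlexibleInstances, OverloadedStrings, TypeFamilies #-}
module Main where

import Data.ByteString (ByteString)
import qualified Data.ByteString as B

data family SecretKey alg

-- Assumed shape of an encoding class; the real Encodable may differ.
class Encodable a where
    encode :: a -> ByteString
    decode :: ByteString -> Maybe a

-- Hypothetical algorithm with a fixed 16-byte key.
data Toy128

newtype instance SecretKey Toy128 = Toy128Key ByteString

instance Encodable (SecretKey Toy128) where
    encode (Toy128Key bs) = bs
    -- Validate the length instead of blindly coercing the bytes.
    decode bs
        | B.length bs == 16 = Just (Toy128Key bs)
        | otherwise = Nothing

main :: IO ()
main = do
    print (fmap encode (decode "0123456789abcdef" :: Maybe (SecretKey Toy128)))
    -- Just "0123456789abcdef"
    print (fmap encode (decode "too short" :: Maybe (SecretKey Toy128)))
    -- Nothing
```

With something along these lines, a malformed key is caught when it is decoded, rather than surfacing later inside the cipher.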
I would like some feedback from the community on this. It does delay publishing to Hackage, as well as writing tutorials, while things still shift around a bit.
As always, this has been pushed to the repo.