Botan bindings devlog

Great work Leo and collaborators! It’s exciting to see this work being done.

2 Likes

Props to @BurningWitness and @ocramz for the PRs, and of course to @ApothecaLabs for the continued effort :tada:

1 Like

That’s a pretty great update. I maintain a small (nearly-irrelevant) package which does FFI bindings. I’m going to have to play with CApiFFI and those CTYPE annotations, and try and modernise it.

1 Like

Holiday Update: Fixing botan-low for the new CAPI changes

The last update was a good one, but it broke a few things in botan-low - hence why I’ve been working off of the CAPI-experiments branch. This update is about as big, and once again touches every module in a library - this time botan-low. It’s a bit late, so I’ll keep this short and sweet.

Change log / greatest hits

  • botan-low now uses the newer better data types and their newtype constructors
  • The mkBindings generator function has been moved to botan-low and expanded
  • Initializers now use generated createObject functions
  • Destructors are now generated
  • Functions now use ConstPtr where appropriate
  • FooCtx-style objects have been renamed Foo
  • withFooPtr-style functions have been renamed withFoo
  • mkFoo functions in Botan.Low.Make are being re-worked
  • Suite of more consistent marshalling functions are being developed
  • “Fixed” unit tests to the point of compiling if the XFFI test modules are deleted and removed from the cabal file (havent actually scrutinized the results yet)
  • botan is still broken :upside_down_face:

Whew! That covered a lot and I’m still testing everything (in fact, the unit tests are still broken, but that’s because hspec-discover doesn’t respect our if flag(XFFI) and tries to include modules that won’t exist), so this update is still living on the CAPI-experiments branch for the time being.

There’s more work to do on botan-low but I’m aiming to get it release-stable like how botan-bindings effectively is - that’s all for now! :grin:


This update has been pushed to the CAPI-experiments branch branch.

5 Likes

New Year’s First Update - Merging CAPI-experiments back into main

Its the first update of the new year! :partying_face: And I bring some good things!

The first big point is the CAPI-experiments branch has become stable enough that I have merged the changes back into main. That means all of the recent CAPI good-ness has now made it back in, effectively freezing botan-bindings minus any additive changes.

Second big point is that I’ve added a bunch of pattern constants for algorithm names to botan-bindings and botan-low to match the data types in botan.This makes the lower-level libraries a tad more useful on their own, and will help guard against stringly-typed errors.

As an example, we might have the following in botan-bindings:

pattern BOTAN_BLOCK_CIPHER_128_AES_128
    :: (Eq a, IsString a) => a
pattern BOTAN_BLOCK_CIPHER_128_AES_128      = "AES-128"

pattern BOTAN_AEAD_CHACHA20POLY1305
    ::  (Eq a, IsString a) => a
pattern BOTAN_AEAD_CHACHA20POLY1305 = "ChaCha20Poly1305"

pattern BOTAN_AEAD_MODE_GCM
    ::  (Eq a, IsString a) => a
pattern BOTAN_AEAD_MODE_GCM         = "GCM"

But express it in botan-low more succinctly:

type BlockCipher128Name = ByteString

pattern AES128
    :: BlockCipher128Name
pattern AES128          = BOTAN_BLOCK_CIPHER_128_AES_128

type AEADName = ByteString

chaCha20Poly1305 :: AEADName
chaCha20Poly1305 = BOTAN_AEAD_CHACHA20POLY1305

gcmMode :: BlockCipher128Name -> AEADName
gcmMode bc = bc // BOTAN_AEAD_MODE_GCM

-- With *extended* algorithm parameters
gcmMode' :: BlockCipher128Name -> Int -> AEADName
gcmMode' bc tagSz = gcmMode bc /$ showBytes tagSz

-- Formatting helpers

infixr 6 //
(//) :: (IsString a, Semigroup a) => a -> a -> a
a // b = a <> "/" <> b

infixr 0 /$
(/$) :: (IsString a, Semigroup a) => a -> a -> a
a /$ b = a <> "(" <> b <> ")"

And even further, give it a proper data type in botan:

data AES
    = AES128
    ...

data BlockCipher128
    = AES AES
    ...

data BlockCipher
    = BlockCipher128 BlockCipher128
    ...

-- NOTE: Current datatype in `botan` does differ slightly
data AEAD
    = ChaCha20Poly1305
    | GCM BlockCipher128 (Maybe Int)
    ...

I’ve tried to strike a balance between safety and exposing functionality by providing constants for algorithm families and modes, without making a constant for every combination. I’ve yet to make the botan data types now use the botan-low constants and functions properly, but I’ll be working on that shortly.

NOTE: I haven’t used Ptr "TheConst\0"# because most algorithms have parameters, and we need to glue the parts together for compound algorithms, and that would be thousands of constants if we unrolled that all.

Third big point is that I’ve split the unit tests up into individual targets so that its easier to see what’s passing and failing - no longer will cipher unit tests spam everything with test failures! This is great because I can just test everything with:

cabal test botan-low

Or I can test specific things via

cabal test botan-low-hash-tests

I’ve also fixed the unit tests - some were broken by being split into separate targets, but I had to get the pattern constants in first as preparation. However, this also means that tests can no longer use hspec-discover - 1) because it doesn’t respect flags and 2) each file is getting its own test-suite anyway - alas!

Unit tests also now (somewhat) use the new algorithm name constants, though I’m still working on that - I’ve tried to be careful, but I’ll be scrutinizing these changes quite severely as I go over them again a few times in coming days.

Last big point is that I’ve also gotten botan building again with all of the changes in the last month or so that broke it. It had been a while since I last poked around the high-level library since so much attention has gone towards botan-bindings, botan-low, and botan-upstream, but I was pleasantly surprised by how I had left things.

I still have a few things to do before I can declare botan-low stable enough for versioning like botan-bindings, however - I need to go over everything with a fine-toothed comb:

  • Find and fix any missing constants
  • Implement the view-bin and view-str functions
  • Clean up unit tests especially algorithm test suite generators
    • Consolidate the Make and Remake files
  • The mystery of why certain unit tests started passing
    • Probably was the cipher tests because of the simplified set of algorithms that we’re testing against (we’ll leave extensive testing to botan)
  • Move some of the non-canonical functions from botan-low to botan
    • I’m specifically thinking of the cipher online-vs-offline functions and the hash convenience functions
  • A little bit of nomenclature standardization
    • BOTAN_BINDINGS_FOO_VAL vs LowFooVal vs (high-level) val

But aside from those things, botan-low has arrived near some critical level of stability, much like botan-bindings. Now that the lower libraries aren’t going to be changing as much, and the roles of the various library levels are more clear, I’m effectively playing ‘fill in the blanks’, and the majority of my focus can be on getting the high-level bindings in botan to that sort of state too, now.

With all of the preparation we’ve made, I hope that it should go fairly quick. :zap:


All-in-all, I’m really relieved to be getting all of this merged back into main. There’s a lot of modules to juggle, so it’s nice when it all aligns nicely and everything builds as it should - botan-bindings, botan-low, botan, and all of the botan-low-*-tests. :relieved:

These changes have been merged to main and pushed to the github repo.

That’s it for now, though you should check back soon for the incipient monthly status report!

6 Likes

Update: Work on botan recommences!

Now that the lower level libraries are more or less stable, I’ve had the pleasure of being able to focus back on the high-level botan library, with its idiomatic and pure-as-possible interface. I have made several large strides in the background while writing the latest monthly update, since I am finally getting to work a bunch of fun stuff that had to be put off until now. :relieved:

First off, Botan.RNG has been strongly refined, and use of random generators has been ensmoothened with the introduction of the MonadRandomIO monad and the RandomT monad transformer (which is an instance of MonadRandomIO). Getting this right is essential - because of the many functions that rely on it, it affects the ergonomics of the rest of the library.

Now, there are two ways of accessing randomness:

  • Directly using an RNG context
  • Implicit access to an RNG context using MonadRandomIO

Direct usage is the old way, and it pretty much looks like this:

main = do
    rng <- newRNG Autoseeded
    addEntropyRNG "Fee fi fo fum!" rng
    reseedRNG 32 rng
    x <- getRandomBytesRNG 12 rng
    print x

Implicit access to an RNG through MonadRandomIO is the new way. IO is itself a convenient instance of MonadRandomIO that
uses the systemRNG:

main = do
    addEntropy "Fee fi fo fum!"
    x <- getRandomBytes 12
    print x

It is also possible to use the RandomT transformer or RandomIO monad (currently a typealias for ReaderT RNG and ReaderT RNG IO respectively):

main = do
    rng <- newRNG Autoseeded
    flip runRandomIO rng $ do
        addEntropy "Fee fi fo fum!"
        x <- getRandomBytes 12
        liftIO $ print x

I’m not exactly sure how MonadRandomIO will change under the hood (RandomT is probably going to become a newtype at least) but basically, any functions that need random values, or that take an RNG as an argument, can now be MonadRandomIO instead - this should make it really easy to generate keys and whatnot, and reduce the number of arguments in general.

For example, with MonadRandomIO, bcryptGenerate now only takes 2 parameters!

main = do
    dg <- bcryptGenerate "Fee fi fo fum!" Fast
    print dg

Secondly, all of the algorithm name constants and functions have been completely vertically integrated, from botan-bindings to botan-low to botan. This makes using a given algorithm much easier and more consistent no matter what level of library you are using.

Thirdly I’ve created a KeySpec class for a better representation of what keys sizes are available for a given primitive. Its similar to the KeySpecifier from crypton. I’m probably going to rename it.

Fourthly, in the Botan C FFI, you have to initialize a cipher or hash or mac context in order to query its sizes, but this data is constant and initializing a context is not free, so I’ve been writing pure / static versions of key spec / block size / tag length / other size query functions so that I can just say:

let bsz = hashBlockSize MD5

Instead of:

bsz <- do
    ctx <- Low.hashInit (hashName MD5)
    Low.hashBlockSize ctx

This is just overall a much better experience, as we can get algorithm parameters by referencing the algorithm itself instead of referencing an initialized context. It is really helpful for generating keys even if you aren’t looking to use them immediately.

Lastly, CI is working - at least for MacOS. I need to test manual Botan C++ installation for Linux, as I’ve determined that botan 3.x packages haven’t been published yet, but that seems to be the only issue at the moment!


As always, these changes have been pushed to the repo :partying_face:

4 Likes

Lastly, CI is working - at least for MacOS. I need to test manual Botan C++ installation for Linux, as I’ve determined that botan 3.x packages haven’t been published yet, but that seems to be the only issue at the moment!

Which Linux distro were you looking at? It looks like Gentoo’s shipping botan-3.2.0 right now, at least for amd64 and ppc/ppc64

@tsuraan

I was using the latest Ubuntu (23.10), and it turns out that that’s the culprit - Botan 3.x+ is too recent to be in the 22.04 LTS and apparently hasn’t made it into 23.10 yet. So Ubuntu needs to follow manual installation for now; in other words, the CI can be fixed rather simply and it’s on my todo list!

Update: botan lurches to life!

Hello everyone, I’m sorry that I’m a bit late. I’ve been doing a lot of thinking, and it can be a bit difficult to post while that is still underway - I am the epitome of ‘think before you speak’, taken to pathalogical extremes :upside_down_face:

The botan-low interface works, so if you don’t mind putting up with its clunky low-level interface, you can have at it already to get stuff working - but the whole point of the higher-level libraries is a better interface, so its worth thinking heavily about.

While stewing about, I’ve been able to do a lot of the rather mechanical but still useful work in getting the high-level botan library up and running - regardless of what interface I eventually choose to express, there are certain things to be done along the way, so I might as well get them out of the way.

As a result, if you’ve gotten used to anything in the highest-level botan library, I’m afraid its changed heavily. Some of the modules that existed prior have been reworked for consistency, and the others will be too soon enough.

Most of the modules are nominally completed:

  • Botan.RNG
  • Botan.Bcrypt
  • Botan.BlockCipher
  • Botan.Cipher minus lazy / online processing
  • Botan.Hash
  • Botan.KeySpec
  • Botan.KDF
  • Botan.MAC
  • Botan.PubKey
  • Botan.PubKey.Load
  • Botan.PubKey.Encrypt
  • Botan.PubKey.Decrypt
  • Botan.PubKey.Sign
  • Botan.PubKey.Verify
  • Botan.SRP6

Other modules are underway as I investigate the various cruft and minutia that’s piled up. I’ve found some gnarly issues with PubKey.Sign since not every pubkey algo works with every signing algo, and I’m seeing some odd things like PEM signatures only working in unit tests and DER signatures working only in GHCI. I’ll slog through it eventually, but for now, try not to stray from the established path, for here be dragons!


This represents our first stab at an idiomatic interface, so it isn’t great, but it isn’t terrible. I’ve mostly made things pure where I could, MonadIO or MonadRandomIO where I couldn’t. I’ve tried to give everything proper types as it helps me smooth the course and see what idioms need to be applied in a given case, but now that it’s here, I know that I can do better. There’s still a lot of other things to get through, but I’ve got a whole checklist and templates to speed all of it up.

If you want to take a look at the sort of increased ergonomics I have in mind, check out the source code to Botan.SRP6 - the exposed interface is mostly a straightforward translation of the botan-low interface, but if you scroll down further, you can see I’ve been building server and client session logic to abstract away a lot of the fiddly bits - I want to do that for all modules, as appropriate!

This of course leaves us with a lot of decisions to make - those pesky things that I’ve been thinking about - so by no means is anything in botan considered stable yet. We have a stable low-level library which works functionally, and now we are trying to get the ergonomics right:

  • Decision: FooType and Foo vs Foo and MutableFoo

    • Foo type and MutableFoo context, vs FooType type and Foo context
    • RNG is the exception? Why? (Because the context is not mutating, it is non-deterministic)
    • Current decision: Foo type and MutableFoo context; reasoning: the highest-level interface gets the simplest names
  • Decision:

    • Flatten SHA / etc variants into larger types?
    • Or define function aliases? Eg sha3 = Cryptohash $ SHA_3 SHA3_512
    • Current decision: Undecided
    • The more I have to use it in its current state, the more it annoys me…
  • Decision:

    • Terminology: algorithms with default
    • Low uses foo for default / argless and foo' with an apostrophe for the function with args
    • Might switch to fooDefault for default, and foo
    • Current decision: Undecided
  • Decision:

    • Classes for things like HasKeySpec / KeySettable, HasBlockSize, etc
    • Current decision: No. It is probably best left for a higher-level classy library, can make botan instances.
  • Decision:

    • Use Enum + Bounded for algos?
    • Initial solution: ADT trees + functions (manual solution eugh)
    • Current decision: None, defaulting to all of the fooName functions…
  • Decision:

    • Designed for qualified or unqualified import?
    • Probably want unqualified import for high level, but what about Mutable?
  • Decision:

    • How to treat nonced MACs (GMAC and Poly1305) vs non-nonced MACs (deterministic to the key+text)?
    • We can apply a MonadRandomIO constraint and ignore the difference return the nonce too, but it is unnecessary for most MACs
    • Initial solution: data MAC = DeterminsticMAC DeterministicMAC | NonceMAC NonceMAC
    • Afterthought: setMACNonceIfNeeded :: (MonadRandomIO m) => MutableMAC -> m ()?
    • Current solution: Only GMAC is actually nonced (Poly1305 folds it into the key),
      so GMAC gets gmac-specific functions
    • Or we could do what nacl / saltine do and make Poly1305 a distinct OneTimeAuth / MAC
    • We could then pull out GMAC as non-deterministic MAC similarly
  • Decision:

    • FooSize vs FooLength
    • Eg, DigestSize vs DigestLength, BlockSize vs BlockLength
    • Initial solution: Undecided, but considering standardizing on ‘Size’
    • Currently standardizing on: Size for algorithm components, Length for (plain- and crypt-) texts
  • Decision:

    • Split MutableCipher into MutableEncipher and MutableDecipher?
    • Would be consistent with PK encrypt, sign
    • Would obviate CipherDirection type
    • Current decision: Keeping track of encrypt vs decrypt in the mutable context
  • Decision:

    • Create aNonceSpec data type? Or generalize to SizeSpec / SizeSpecifier
    • Current decision: Unconsidered, still using validNonceSize and defaultNonceSize functions
  • Decision:

    • Terminology: validKeySize vs defaultKeySize
    • Also for nonces
    • Also size vs length (size for elements, length for message / ciphertext?)
      • Size slightly implies (a somewhat) fixed value, whereas length is more instanced
    • Leaning towards defaultFooSize :: alg -> Int and validFooSize :: alg -> Int -> Bool
  • Decision:

    • Error handling
      • Some (mutable) functions are failable if not used in the proper order
      • Some (mutable) functions are failable because the specific algorithm lacks support
      • Some (user) errors are not fatal (eg, setFooKey with incorrect key length)
        • Should we catch the error and return a bool?
    • How do we express this?
    • Current status: Allowing exceptions to be thrown - are exceptions satisfactory here?
  • Decision:

    • Cipher (and other processing algorithms) need to conform to a consistent interface
      but the APIs have differences
    • Example: Nonces in ciphers as an argument to cipherStart vs (G)MAC set(G)MACNonce
      • Mostly affects only the mutable interface, in that they may have a required order
        or some other peculiarity that is only visible to the internals
      • The use of a nonce is associated with that specific instance of processing,
        and setNonce is more free-er than start(WithNonce) which has a specific order of use.
      • We could imagine the other cryptoprocesses as having a no-op start function
    • Arguably, the highest-level API should take all of the arguments, such that the implementation’s
      order of application doesn’t matter.
  • Decision:

    • Clarify ‘clear’ vs ‘reset’ consistently
    • Some algorithms only have ‘clear’, but others have a more limited ‘reset’ that preserves keys
  • Decision:

    • Push the higher MutableFoo terminology down to botan-low, eg (eg, setFooBar instead of fooSetBar)
    • Current status: Pondering, no need for the churn at the moment
  • Decision:

    • Collapse modules together?
      • HOTP + TOTP = OTP
      • No PubKey submodules?
    • Specificity vs ease of use
    • High-level libraries focus on ease of use, the low-level libraries are quite specific;
      there is benefit to collapsing these modules in botan
    • Addendum: Collapsed all the pubkey algorithm-specific module down to Botan.PubKey.Load
  • Decision:

    • What to call PubKey?
    • The module name references Public Key Cryptosystems, of which public keys are a component.
    • Possibly rename it PKC or CryptoSystem (or should CryptoSystem be the generalized concept, not just public keys?)
    • With the PKC namespace, it may make a bit more sense to move PrivKey and PubKey under it a la:
      • Botan.PKC.PubKey
      • Botan.PKC.PrivKey
      • Botan.PKC.Encrypt
      • Botan.PKC...
  • Decision:

    • Use of MP in APIs?
    • Can take Integer instead, eschew Botan.MPI entirely
    • Result: Yeah, definitely elide MP entirely in favor of Integer
  • Decision:

    • Elide pointless accessors (such as length queries that have already been used in Botan.Low)
  • Decision:

    • How to represent PubKey types that are only used for specific operations
    • That PubKeys require an algo and params causes issues the current setup for pk operations
    • Example: What signing algos are usable is dependent on what pubkey is use
  • Decision:

    • How to deal with the gnarly algorithm hierarchies?
    • Break algorithms up into individual data type, and use classes?
      • Currently we end up with stuff like: pkSign rsa (EMSA $ EMSA4 (Cryptohash $ SHA2 SHA512) Nothing) ...
        as opposed to something like pkSign rsa (EMSA4 SHA512)
    • Would require cryptographic classes (crypto-schemes) and botan instances
    • Decision: not yet
    • Much longer scope
  • Decision:

    • Botan’s unified data types are a mixed bag - convenience, but problems too
    • functions with keys can fail if an incorrect key is used, eg:
      • mac :: MAC -> MACKey -> ByteString -> Maybe MACDigest
    • algorithm-specific keys can be assumed valid and thus we can get rid of the Maybe:
      • sha512hmac :: SHA12MACKey -> ByteString -> MACDigest
    • Right now we use exceptions if the key is incorrectly sized, should we keep doing that?
    • Or should we convert all of these exceptions to Maybe?
    • Or should we expose algorithm-specific functions?
    • Current result: Mostly still throwing exceptions

So yeah, a lot. It is taking shape though, and just re-reading the list of decisions-to-be-made as a whole helps give me a direction though, and you can of course provide any feedback you might have.


As always, the repo has been updated.

3 Likes

Wow, there’s lots to still be done it seems :sweat_smile:

A few things that pop into my head after reading this and looking through the SRP6 and other modules:

  • Please use newtype for any types that will be function arguments. The worst thing to happen is switching up ByteString arguments and not knowing until runtime that you’re using the password as the salt, or some other mixup.
  • Exporting SRP6Salt(..) does nothing if it’s a type synonym, right? Might even produce some warnings? Also will export any constructors if you change it to a newtype, which might not be desireable.
  • As a general rule, I’d probably not export any constructors of types that shouldn’t be directly fiddled with, as I expect that will be the case in a lot of modules for crypto functionality.
  • I’m not very knowledgeable on all the different permutations/combinations of hashes and algorithms, but having separate functions for separate combinations would make for a more pleasant API, IMHO.
    i.e. make a sha512hmac if that provides an API where it is guaranteed to work. This way, you’d also have sections in the documentation for every hash/algorithm, so you can more accurately provide the caveats and other side-notes that come with each hash/algorithm combination.
  • I see here that the HMAC constructor takes a Hash, but that the comment says it should never be a Checksum, so should it just take a CryptoHash?
  • Is SipHash 2 4 the only valid SipHash? Might make more sense to just name the constructor SipHash24 then?
  • I personally would like that these functions never throw exceptions, but please return Maybe or Either if I have to be ready for them to fail. Though setting up the API so that non-failing combinations are easy and type-safe should be a priority, I feel.

I have lots of opinions on making good developer experiences with nice APIs, but I just don’t know what all the options are :sweat_smile:
I’d be happy to go through the botan library with you once it’s more “done” to help with nailing down the API and/or to do some brainstorming in general.

2 Likes

Oh there’s always more work to be done, especially now that we have some choice regarding how we should implement things. I’m just glad to have someone else following along as a sanity check, and I think I agree with most if not all of your points - you’ve highlighted a few items that are burning a hole in my todo list.

Decide a few of the right things and it all falls into place soon enough, but it is easier to decide when others share their opinion. Thanks :grinning_face_with_smiling_eyes:

2 Likes

A Classy Update

Decisions are like dominos, knock a few over and the rest come tumbling down.

After some feedback, and a lot of playtesting, it has become clear that the algorithm ADT trees are terribly unwieldy, and not at all the sort of interface that I’d envisioned when setting out on this project. In response, I’ve come to a decision:

ADTs were better than raw strings or constant patterns, but now they are getting in the way - expressions like AEAD $ GCM (BlockCipher128 AES_256) 16 and Cryptohash $ SHA3 $ SHA3_512 are awfully frustrating to read and use. I’m (eventually) axing the algorithm ADTs, in favor of a better approach.

I was initially following z-botan's lead which was helpful at first - however, we are not beholden to that format. Additionally, with the need to add support for BOTAN_HAS_ conditional defines for individual algorithms, the ADT approach makes less and less sense.

Instead, I am proposing a classier interface that uses data families to ensure type isolation and inference. Originally, I was planning on working on this interface as a separate cryptography library (originally called crypto-schemes but that sounds too nefarious), and then making botan conform to it in a separate cryptography-botan library. However, at this point it seems more sensible to just skip the extra step of a separate library, and just implement the conformances in botan itself, while developing cryptography inside of botan to be extracted as a separate library later.

As a result, this update is focused heavily on these new typeclasses:

  • Botan.BlockCipher.Class
  • Botan.Cipher.Class
  • Botan.Hash.Class
  • Botan.MAC.Class
  • Botan.OneTimeAuth.Class

The new classes are something like:


data family SecretKey alg
data family Ciphertext alg

class BlockCipher bc where
    blockCipherEncrypt :: SecretKey bc -> ByteString -> Maybe (Ciphertext bc)
    blockCipherDecrypt :: SecretKey bc -> Ciphertext bc -> Maybe ByteString

data family Nonce alg

class Cipher c where
    cipherEncrypt :: SecretKey c -> Nonce c -> ByteString -> Ciphertext c
    cipherDecrypt :: SecretKey c -> Nonce c -> Ciphertext c -> Maybe ByteString

data family Digest alg

class Hash h where
    hash :: ByteString -> Digest h

data family Auth alg

class MAC m where
    auth :: SecretKey m -> ByteString -> Auth m

data family OneTimeAuth alg

class OTA ota where
    oneTimeAuth :: SecretKey ota -> Nonce ota -> ByteString -> OneTimeAuth ota

This isn’t exactly how they are (still being) implemented, but its an accurate enough representation. Other algorithms and modules having multiple data families are slightly more complicated to write, but are coming soon, pending some more data family work. I have tried to create a proof-of-class implementations of at least one algorithm per class type, to show that it functions as intended:

  • Botan.BlockCipher.AES
  • Botan.Cipher.ChaCha20Poly1305
  • Botan.Hash.SHA3
  • Botan.MAC.CMAC
  • Botan.OneTimeAuth.Poly1305

A gold-star example of a relatively finished algorithm module (and the effectiveness of the approach) would be Botan.Hash.SHA3, which we can explore:

import Botan.Hash.SHA3

It has per-algorithm -level functions:

sha3_512 "Fee fi fo fum!"
-- 03a240a2...

It also has algorithm-family -level functions that can use TypeApplications to select specific variants:

sha3 @512 "Fee fi fo fum!"
-- This produces the same digest as before

Explicit typing also works:

sha3 "Fee fi fo fum!" :: SHA3Digest 512 -- Or SHA3_512Digest

These functions are implemented via a more generic, classy Hash interface which uses the Digest data family to ensure that different algorithms and variants have different types while still being inferred properly.

import Botan.Hash.Class
:i Hash
-- class Hash h where
-- hash :: ByteString -> Digest h
:i Digest
-- data family Digest h

We can allow our hash algorithm to be parametric using hash, while still using type applications or inference to select our specific algorithm:

-- Once more at the class-level
hash @(SHA3 512) "Fee fi fo fum!"
-- Once more with explicit typing
hash "Fee fi fo fum!" :: Digest (SHA3 512)

The other classes work for at least one algorithm, but at the moment it might require a bit of unsafeCoerce to turn bytestrings into keys, while I get better support for that sort of thing underway.

Here’s CMAC AES128:

import Botan.MAC.Class
import Botan.MAC.CMAC
import Botan.BlockCipher.AES
import Botan.RNG
import Unsafe.Coerce
k <- getRandomBytes 16
mac @(CMAC AES128) (unsafeCoerce k) "Fee fi fo fum!"
-- 7989fb40105646e975311785efae3048

And here’s the ChaCha20Poly1305 cipher

import Botan.RNG
import Botan.Cipher.Class
import Botan.Cipher.ChaCha20Poly1305
import Unsafe.Coerce
k <- getRandomBytes 32
n <- getRandomBytes 12
ct = cipherEncrypt @ChaCha20Poly1305 (unsafeCoerce k) (unsafeCoerce n) "Fee fi fo fum!"
-- 2b0c0e4e332b4214d3c939b0d1af90a89167d914df538f6cdc364371dd8d
pt = cipherDecrypt @ChaCha20Poly1305 (unsafeCoerce k) (unsafeCoerce n) ct
-- Just "Fee fi fo fum!"

Other classes and data families will be quite similar. Notably, we avoid passing around an explicit algorithm witness / proxy, but remain type-injective due to the data families, and only one call site is required for inference to work. It is also clear that this approach will be very amenable to TemplateHaskell in the future. And don’t forget, eventually, these classes will be pulled out into a backend-agnostic cryptography library.

I’m still currently working on some support classes for data families in Botan.Types.Class, such as Encodable and SecretKeyGen and NonceGen which have not yet been applied to the aforementioned cryptography classes but will provide the necessary support to make writing data family instances much easier. If you’ve used saltine or cryptonite, you’ll recognize their influence.

I would like some feedback from the community on this - it does delay publishing to hackage as well as writing tutorials, as things still shift around a bit.


As always, this has been pushed to the repo.

1 Like

I’m wondering if it’d be an ok API if you’d have the Enums in their horribly wordy state, but then:

  • make newtypes for every section that only accepts parts of certain algos/etc
  • don’t export any way to create that newtype
  • EXCEPT for a group of pattern synonyms that contain all the valid ones; and
  • have a {-# COMPLETE #-} pragma to tell GHC the provided patterns are all that are valid :thinking:
1 Like

Perhaps ‘axing’ is a tad strong - lets say, relegated to the status of ‘prefer the other interface’ along side the mutable interface, in favor of this (hopefully) better new approach.

a group of pattern synonyms that contain all the valid ones

One of the difficulty of creating pattern synonyms for valid algorithm combinations is that there’s literally thousands of them - I’d have to rely on TemplateHaskell in order to make that feasible. I first ran into the issue with unit testing when the sheer number of tests (one per combination) caused the unit tests to terminate without outputting the results - this is why there are so many individual unit test targets now :upside_down_face:

Note that although I can’t provide convenient aliases for every algorithm combination, I certainly can make sure to do that for all priority / best-in-class algorithms as I did in Botan.Hash.SHA3, which exports not just the family sha3 but also all of the individual SHA3 variants (sha3_224, sha3_256, sha3_384, sha3_512).

make newtypes for every section that only accepts parts of certain algos/etc

Things have already been improved beyond yesterday’s update. With the new data families, we are accomplishing exactly this :smiley:

For example, AES128SecretKey (formerly AES128CipherKey, but ) is now actually a newtype wrapper around a GSecretkey type from which I can automatically derive everything.

First, GSecretKey is itself a newtype wrapper around Bytestring:

newtype GSecretKey = MkGSecretKey { unGSecretKey :: ByteString }
    deriving newtype (Eq, Ord, Encodable)

Then I then wrap the algorithm-specific newtype around that, plus a pattern and function to hide the GSecretKey:

newtype instance SecretKey AES128 = MkAES128SecretKey GSecretKey
    deriving newtype (Eq, Ord, Show, Encodable)

pattern AES128SecretKey :: ByteString -> SecretKey AES128
pattern AES128SecretKey bytes = MkAES128SecretKey (MkGSecretKey bytes)

getAES128SecretKey :: SecretKey AES128 -> ByteString
getAES128SecretKey (AES128SecretKey bs) = bs

And now I can just say:

newSecretKey @AES128
-- 591b3de67f882893a11af874fbf40bdd

Of course, this won’t be just for SecretKey and BlockCipher - I’m doing the same with other algorithms and supporting data types such as nonces, ciphertext, mac codes, salts, wherever appropriate.

It’ll all be out in the next update - I’ve already made some mad big strides since yesterday’s update (those dominoes keep tumbling down, and I’ve been in a happy little coding groove for the last few days.

1 Like

Update: Families!

After a serious shakedown with the recent classy update, things have really settled out.

  • Botan.Types.Class now provides reusable data families SecretKey, Nonce, Digest, Ciphertext, LazyCiphertext for common cryptography components
  • Also provides Has[Foo] / Is[Foo] typeclasses and G[Foo] generic types for data families
  • KeySpec has been generalized to SizeSpecifier in Botan.Types.Class
  • Botan.Types.Class is now considered the gold standard (nominally ready for release / documentation / tutorials)
  • Modules must conform to Botan.Types.Class to be gold standard

These modules have been updated to the current gold standard:

  • Botan.BlockCipher.Class
  • Botan.BlockCipher.AES
  • Botan.Cipher.Class
  • Botan.Cipher.ChaCha20Poly1305

These typeclasses are still being worked on to integrate with Botan.Types.Class, but work at this point is mostly mechanical.

  • Botan.Hash.Class
    • Needs to be updated to use the Digest / HasDigest / GDigest in Botan.Types.Class
  • Botan.MAC.Class
    • Ditto SecretKey / HasSecretKey / GSecretKey
    • Ditto Digest / HasDigest / GDigest
  • Botan.OneTimeAuth.Class
    • Ditto SecretKey / HasSecretKey / GSecretKey
    • Ditto Nonce / HasNonce / GNonce
    • Ditto Digest / HasDigest / GDigest

These typeclasses still need to be developed and are considered critical for first release:

  • Botan.PubKey.Class
  • Botan.PubKey.Encrypt.Class
  • Botan.PubKey.Sign.Class

Other typeclasses still need to be developed, but are not considered critical for first release as their concrete interfaces currently suffice.

  • Password hashing (for Bcrypt, etc)
  • FPE
  • HOTP / TOTP
  • KDF
  • Key Wrapping
  • KeyAgreement / KeyExchange / PAKE (for SRP6, DHKE)
  • ZFEC

We’re delaying our hackage release candidate by a week, maybe two, but its worth it, I promise. Slow is smooth, and smooth is fast. I want this library to be smooth.

As always, this has been pushed to the repo .

2 Likes

It really feels like the kind of abstraction problem you’re trying to solve between having an abstract interface of crypto algorithms with different backends such as botan is a perfect fit for backpack… :school_satchel:

2 Likes

Someone’s already done that: raaz: Fast and type safe cryptography.

I’m actually surprised this didn’t make the prior art of the tech proposal.

2 Likes

@romes

It certainly helps to have every gear polished and lubricated, ready to fit into place…

@jaror

Oooh, I don’t know how I missed raaz, but I’mma have to give it the ol’ look-see-and-digest for a bit - I’m sure there’s things to learn from it. :smiley:


Small aside, providing different backends for implementing the same algorithm, and providing typeclasses for implementing similar but separate algorithms, are two distinct goals. They do work together quite nicely, however, as they are related.

Update: More gold-standard modules, imminent Hackage release

Hello, all! It is time for another update, but first:

I must apologize, for I have fallen prey to my tendency to keep my eyes on the horizon instead of on the finish line, and have ended up moving my own goalposts. Truth be told, botan-bindings and botan-low passed the finish line some time ago, and I really should have pulled the trigger on publishing them already. :grimacing:

This is where I really appreciate the community framework; weekly meetings with José have been helping me to keep my updates more regular, and also keep me on track and appraised of any concerns. Frankly, I don’t want to be a silo’d developer, and communication is an essential part of that.

As such, a minor course correction: I’m switching focus, and the goal of the next update is going to be the publishing of botan-bindings and botan-low to hackage as a candidate package (alongside the third monthly status report, of course, which is also now due! Egads!).

I am a bit anxious, never having published a package to hackage before. I look at the project and its very easy for me to forget what I’ve done, and only see what still needs to be done, or could be done - perfectionist’s disease. Its fine, it doesn’t have to be perfect, it just has to be sufficient for initial release, and it is. We can still keep working on it after we publish.

There’s a few things that must happen before publishing:

First, according to hackage requirements, I need someone to endorse me, so if a fine citizen or two could assist me in this, I would appreciate it.

Secondly, there’s the minor question of pkgconfig-depends vs extra-libraries. Right now, the stanzas in botan-bindings.cabal look like this:

library
    ...
    includes:
        botan/ffi.h
    if os(windows)
        extra-libraries: botan-3
    else
        pkgconfig-depends: botan-3 >= 3.0.0
    ...

This has been working so far, but I want to ensure is configured properly, since it is how we link to the Botan C++ library. There are more than a few different ways of handling this, spread over several discussions:

I’d like to handle it properly - if I’ve read the discussions correctly, we can crank the cabal version dependency up and improve it. I’d appreciate any suggestions here.

Thirdly, I need to go over botan-bindings and botan-low with a fine-toothed comb and do a few things like establish dependency version constraints to make the package acceptable to hackage. I’ll be on this for the next few days, and then hopefully upload it as a candidate on Monday.

So, time to get it done!


Now, back to this update.

  • The following modules have been completed to gold-standard, and conform to Botan.Types.Class:
    • Botan.BlockCipher
      • AES
      • ARIA
      • Blowfish
      • Camellia
      • CAST
      • DES
      • GOST
      • IDEA
      • Noekeon
      • SEED
      • Serpent
      • SHALCAL
      • SM4
      • Threefish
      • Twofish
  • I’ve added a temporary BlockCipher128 typeclass while a proper blocksize constraint is developed.
  • There have been improvements to error reporting, and the last exception message is now attached to any thrown Haskell exceptions.
  • I’ve laid out a Botan.Easy module that exposes a saltine-like interface of recommended algorithms

That’s all for now, it’s been pushed to the repo. We’ll be back on this stuff after we publish to hackage.

5 Likes

pkg-config

I strongly recommend pkgconfig-depends, if botan upstream publishes .pc files. The main reason is that it means botan developers are deciding what the link flags are, rather than all their library consumers making a best guess. It also means you’re declaring a dependency on a native package instead of linker flags (which are more of an implementation detail), which seems more like the correct level of abstraction.

I would hope that you’d be able to use pkg-config on Windows but I haven’t played in that space for many years. Doesn’t GHC come with some sort of MSYS setup for this sort of thing?

4 Likes

Yes, pkg-config works on windows (or at least it is not any worse than other options). See my experience report: Installing a library with C dependencies on Windows

1 Like