Botan bindings devlog

Linear types are primarily intended for safety and unlifted types are partially intended for more safety, so I think it would count as a use and not an abuse.

3 Likes

A lot of the practical answers to this will lie in how we design our higher-level API, which will probably involve constructs akin to withStoredPrivateKey keyRef $ \ key -> ... , and we’ll have to build stores and atomic cryptographic operations to guide the user away from ever handling raw / exposed keys and such.

@cdepillabout and I keep the aforementioned concerns in mind when we add to or change anything in the password library.
The Password type explicitly has only one way of getting to the actual string of characters, which is the very obviously named unsafeShowPassword function.

I’m also very hesitant to allow a ToJSON instance for these exact reasons.

If anyone with crypto expertise would like to check out the library and advise on improvements, that’d also be very appreciated.

3 Likes

The repo has been updated with the following:

  • Completed low-level bindings for Botan.Error, Botan.Utility, Botan.Hash modules
  • Added missing Botan.Error error codes, botanErrorDescription function
  • Implemented missing Botan.Utility functions for memory scrubbing, base64 encoding and decoding
  • Implemented missing Botan.Hash functions for copying state, block size
  • Added flags for lower-case hex encoding
  • Removed memory dependency - see issue

Its creeping towards minimally functional, though low-level it may be - it will clean up nicely with higher-level bindings later. I think my next target may be the RNG module low-level bindings.

5 Likes

Another day, another module! This time, it’s Random number generators!

The repo has of course also been updated, with the following:

  • Added Botan.Random (the RNG interface)
    • The Random random number generator opaque type
    • The RandomType type for specifying the type of Random
    • randomInit and randomInitWith functions to create a Random
    • randomGet and systemRandomGet functions for getting n bytes of random.
    • randomReseed and randomReseedFromRandom reseeding functions
    • randomAddEntropy function for adding your own bytes of entropy
    • NOTE: botan_rng_init_custom function is not implemented. It looks complicated, for now.
  • Switched a few alloc to malloc because long-lived references were being freed*
  • Added pure hashWith convenience method

* If someone could sanity check the ForeignPtr initialization pattern I’m using, I would appreciate it.


If there are any particular modules that you would like me to tackle next, or if you would instead like me to focus a little more on developing some higher-level bindings to Hash / Random, please let me know. Otherwise, I’m following the Botan FFI header which means that message authentication codes aka MAC would be the next module. I’m open to listening to the community on this :slight_smile:

8 Likes

This is great progress! I’m really enjoying these posts!

4 Likes

The sun has risen again. You know what that means. The repo has been updated again.

  • Added Botan.Mac module for Message Authentication Codes
    • Construction of Botan mac type strings (eg, “HMAC(SHA-256)”, “SipHash(2,4)”) is badly-documented.
    • Some MACs require nonces, but this is also badly documented.
  • Refined handling of foreign pointers of botan objects / initializers / finalizers
    • Accurate types for opaque struct, pointer, and foreign pointer.
    • Uses FinalizerPtr type
    • Avoids an extra pointer indirection
    • Still fiddling with standardizing the nomenclature, only applied to Botan.Mac
      • Botan.Hash and Botan.Random still need retrofitting
    • Thanks to @glguy for help

Today’s efforts are going to be focused on updating the foreign pointer handling in Botan.Hash and Botan.Random, and getting a better sense of what needs to be standardized - eg, whether I will continue with straight 1:1 bindings, or instead do some encapsulation, considering:

  • Things like the hash / mac / rng type strings. macInitName "HMAC(SHA-256)" are both incredibly simple, yet incredibly awkward, and need constants at the very least.
  • Checking for algorithm support is also awkward (read, effectively non-existent)so I need to pick a method of handling it, and standardize that as well.

So basically a grab-bag of things that I want to take care of before we get enough modules to make them difficult.

Also, by line count, we are almost 1/4 of the way through the botan ffi header! Things do get denser towards the bottom, especially with x509 certs, but still… :partying_face:


@david-christiansen
I’m glad you are! Tough work is so much less of a slog when there are people rooting for you along the way, and the community’s response has really been keeping me going :slight_smile:

6 Likes

This is just a minor update, with no changes pushed as of yet. The stuff I’ve been working on is more process refinement, rather than a new module - though I am using the Cipher module as a testbed for this, the process refinement is the primary intent.

I’ve been doing some work to make implementations use a set of common initializer functions. Botan has a consistent enough structure that many separate structures share common patterns that are reusable, or at least minimal variation.

An example of what that looks like:

data CipherStruct
type CipherPtr = Ptr CipherStruct

newtype Cipher = MkCipher { getCipherForeignPtr :: ForeignPtr CipherStruct }

withCipherPtr :: Cipher -> (CipherPtr -> IO a) -> IO a
withCipherPtr = withForeignPtr . getCipherForeignPtr

foreign import ccall unsafe botan_cipher_init :: Ptr CipherPtr -> CString -> CipherFlags -> IO BotanErrorCode
foreign import ccall unsafe "&botan_cipher_destroy" botan_cipher_destroy :: FinalizerPtr CipherStruct
foreign import ccall unsafe botan_cipher_name :: CipherPtr -> Ptr CChar -> Ptr CSize -> IO BotanErrorCode
foreign import ccall unsafe botan_cipher_output_length :: CipherPtr -> CSize -> Ptr CSize -> IO BotanErrorCode

cipherInit = mkInit_name_flags MkCipher botan_cipher_init botan_cipher_destroy
cipherName = mkName withCipherPtr botan_cipher_name
cipherOutputLength = mkOutputLength_length withCipherPtr botan_cipher_output_length

If we peek under the hood, we can see that I have just taken the pattern that we were using and formalized it as a higher-order function by lifting the concretely-typed constructor / withPtr / funPtrs out as arguments (I am a fan of this pattern).

mkInit_name_flags constr init destroy name flags = do
    alloca $ \ outPtr -> do
        ByteString.useAsCString name $ \ namePtr -> do 
            throwBotanIfNegative_ $ init outPtr namePtr flags
        out <- peek outPtr
        foreignPtr <- newForeignPtr destroy out
        return $ constr foreignPtr

mkName withPtr name template = withPtr template $ \ templatePtr -> do
    alloca $ \ szPtr -> do
        bytes <- allocBytes 64 $ \ bytesPtr -> do
            throwBotanIfNegative_ $ name templatePtr bytesPtr szPtr
        sz <- peek szPtr
        return $ ByteString.copy $ ByteString.take (fromIntegral sz) bytes

mkOutputLength_length withPtr funPtr template length = withPtr template $ \ templatePtr -> do
    alloca $ \ szPtr -> do
        throwBotanIfNegative_ $ funPtr templatePtr (fromIntegral length) szPtr
        fromIntegral <$> peek szPtr

Using this pattern and declaring implementations this way should result in easier-to-read and more reusable code, and help minimize the potential for bugs. It will also significantly reduce the effort required to implement and maintain this library in the long run.

2 Likes

I am happy to present a major update to the bindings library, which has been pushed to the repo. Most of the work was focused on systemically improving what was already written so far, but as a result, I was also able to implement several more modules!

I have:

  • Created a Guide.md to help standardize initialization / destruction / handling of botan objects
  • Added convenience as- functions to Botan.Prelude for bytestring pointer handling
  • Improved botan error code handling for positive values
  • Improved Botan.Utility nomenclature
  • Created Botan.Make template object initialization / destruction / accessor functions to reduce boilterplate
    • This will greatly reduce the effort required to implement most functions
    • Only ‘unique’ functions now need be implemented manually
    • Minimizes attack / error surface, and improves the consistency of code
  • Implemented Botan.Cipher as test for using new Botan.Make functions
  • Re-implemented Botan.Hash, Botan.Mac using make-functions
  • Implemented Botan.BlockCipher, Botan.KDF, and Botan.Password modules
    • None of algorithms for Botan.KDF appear to be supported by my botan installation, so it is untested

The biggest thing is the Botan.Make module, which lets me handle most bindings very simply - it is now possible to implement most binding functions as such:

foreign import ccall unsafe botan_cipher_set_key :: CipherPtr -> Ptr Word8 -> CSize -> IO BotanErrorCode
cipherSetKey :: Cipher -> ByteString -> IO ()
cipherSetKey = mkSetBytesLen withCipherPtr botan_cipher_set_key

This makes implementing bindings vastly more easy - as a result of this, we are almost half-way through the Botan FFI header, with the MPI (Multiple Precision Integer aka BigInt) interface being the next target - it is necessary dependency of the public / private key modules.

5 Likes

This sounds like a great improvement. It’s always interesting to see how working with a problem for a while brings insights that weren’t apparent in the beginning simply from looking at a spec.

3 Likes

I took a peek at your code for cipherUpdate and saw that you’re using a ByteString as an intermediate buffer for botan_cipher_update. You then copy the content into a new ByteString for the return value. Wouldn’t it save a heap allocation to use allocaBytes for the intermediate buffer and then create a new ByteString from it for the return value?

1 Like

Another question: do you have plans to create a sum type for the cipher names etc. that are stringly-typed? Or is that a fool’s errand due to the maintenance overhead?

1 Like

Nothing quite substitutes for getting one’s hands dirty deep in the guts of a problem!

Indeed it would, and there are other places too for which there is unnecessary copying - it is mentioned in the TODO list in the README. For the moment, I have opted for safety over speed, but I plan on doing an optimization pass once I am satisfied that everything works.

cipherUpdate in particular is going to require a fair bit of work to get past low-level 1:1 bindings, because its behavior is rather dependent on the algorithm. While it plays nicely with ChaCha(20) because I can dump the entire plaintext at once, it gets rather nuanced for ciphers that have a required buffer granularity like AES-256/GCM, which means that I’m not just going to have to optimize cipherUpdate but also the chunking of the input by granularity as well :slight_smile:

It is a difficult problem that I have given a lot of thought. Part of the difficulty in building high-level cryptography bindings is that many of the low-level primitives do not present a uniform interface, and many of the things that appear to be ‘primitives’ are themselves actually compound constructs. It is only at the higher levels that everything looks the same.

Botan avoids this by making the name more than just a simple enumerable value - it is a specification format that it parses into a valid construct using the parameters supplied in the format. This isn’t immediately obvious, but it can be discerned by using the name functions, which will print out the fully expanded name. Some examples:

  • A Hash type often has several variants differing by digest length - SHA-3 is actually shorthand for SHA-3(512), but the format chooses default values.
  • The Mac type takes a hash algorithm as a parameter, and so HMAC(SHA-3) is valid, but so is HMAC(SHA-3(512))
  • It is a similar or worse situation for ciphers and other constructs - AES-256/CTR is CTR-BE(AES-256), but AES-256/GCM is really AES-256/GCM(16).
  • Then there’s things like AES-128/CBC/PKCS7
  • I have not yet found a listing / description of these name formats, but have found information by going through the source code and examples.
  • Even worse, not all mentioned algorithms are actually supported
  • The only way to find out if an algorithm format / combo is supported is to try and construct an object and have it fail with ‘Not implemented’ if it isnt

A simple sum type is almost certainly out of the question - I’m not even sure that all algorithms are enumerable, because some may take unbounded parameters (eg, I think maybe PBKDF(n)), others take multiple parameters (SipHash(2,4)) .

I’m not entirely certain what I’m going to do, but my current thought process is to make it easy to generate names from a primitive or compound spec, something like:

data SHA3Spec = SHA3Spec Int -- Or even SHA3_256 | SHA3_384 | SHA3_512
data HashSpec
    = SHA3 SHA3Spec
    | ...
data MacSpec
    = HMAC HashSpec
    | ...

I’ll be focusing a lot more on this sort of thing once I complete the 1:1 low-level bindings, but right now using any of it feels really crude - and I have a fair bit of cryptography knowledge! I can see it being nigh-impossible for anyone else to use, at the moment.

7 Likes

Wow, thanks for the detailed reply!

I noticed from your example that there was probably some kind of hierarchy in the cipher name. It’s a lot more complex than I realized, however. Your idea for using the full richness of ADTs to model the possibilities seems good, but I’m sure the devil will be in the details.

Good luck, and thanks for working on this.

3 Likes

Hi ho! Hi ho! Another update we go!

Changes are:

  • Implemented Botan.Bcrypt module
  • Implemented Botan.MPI module
  • Improved error handler nomenclature
  • Added mk- functions for exceptional error handlers
  • Renamed CipherFlags → CipherInitFlags

The major thing here is the Multiple Precision Integer aka a BigInt implementation. It’s pretty awkward having to initialize and set integers manually right now, but it should clean up nicely in the future when we’re giving everything more idiomatic bindings and can make it conform to the numeric classes.

Next up is private and public keys, which are frankly a bit of a beast because they have a whole bunch of algorithm-specific functions, but some of it is deprecated so perhaps it won’t be too bad :slight_smile:

4 Likes

Today’s devlog entry is spicy - it’s the start of asymmetric / public key cryptography! We can now generate keys, and sign and verify messages!

Changes include:

  • Implemented Botan.View for botan_view_ctx functions - untested
  • Renamed Botan.PasswordBotan.PwdHash for consistency
  • Moved Botan.Make to other-modules stanza
  • Started Botan.PubKey (public key cryptography) module
    • Implemented PubKey and PrivKey generation
  • Implemented Botan.PubKey.Sign and Botan.PubKey.Verify (digital signatures), tested with Ed25519 signatures

I chose Ed25519 for testing because it does not involve any extra parameters, and as with previous modules, the particular format is undocumented and I am having to go through the C++ docs which are different from the C FFI docs :grimacing: to try and figure it out. The old Z-Botan bindings are turning out to be more helpful here with providing the necessary constants, even if the bindings themselves were not as usable.

The repo of course has been updated. :coffee:


So the general state of it right now is still rather raw, but we’ve passed the halfway mark for 1:1 bindings. Here’s what we have left:

  • The rest of the PubKey operations
  • FPE (Format Preserving Encryption) functions
  • HOTP (HMAC-based One-time Password) algorithms
  • NIST Key Wrapping functions
  • SRP-6 Server protocol
  • TOTP (Time-based One-time Password) algorithms
  • X.509 certificates and revocation lists
  • ZFEC (Zero-Copy Forward Error Correction)

At a high level (eg, the entire todo-list notwithstanding), aside from the PubKey operations and X509 certs, what remains is mostly the more esoteric cryptography operations. It surprised me, but for comparison, if I were to finish implementing PubKey + HOTP, that would put these bindings roughly on par with libsodium in terms of cryptographic operation coverage, if not in ergonomics - but when we do get to doing the higher-level bindings, it is all going to clean up rather nicely.

6 Likes

It’s great to read on your progress, thanks for your work on this and on the updates!

2 Likes

Weekend update!

The changes in this update are:

  • Implemented allocBytesQuerying to Botan.Make
  • Implemented a few more Botan.PubKey functions
  • Implemented Botan.PubKey.Encrypt module
  • Implemented Botan.PubKey.Decrypt module

This one was spent mostly dealing with a bit of a head-scratcher - how do you allocate a buffer before you know how large it must be? I’ve been either calling length functions, explicitly calculating, or allocating in excess so far, but now I’ve pieced together the appropriate / relevant documentation on how the size pointers should be handled - its a little awkward, but you can pass in null instead of a buffer to query for the buffer length.

Seeing how safety of implementation is one of our critical concerns, I’ve written an allocBytesQuerying to handle this, and I’ve implemented a few modules using it as a testbed - public key encryption and decryption! I still need to go back and retrofit previously-written functions, and likely will by the next update.

Naturally, the repo has been updated :slight_smile:

5 Likes

Hah it’s fun how you end up learning the idiosyncrasies of a C library by writing bindings for it like that. Good work!

1 Like

Another update. It’s a bit smaller since ZFEC didn’t make it quite yet (multidimensional pointer arrays :grimacing:), but I got a few other things done.

Changes are:

  • Implemented Botan.HOTP (HMAC-based One-Time Password) module
  • Implemented Botan.TOTP (Time-based One-Time Password) module
  • Implemented Botan.KeyWrap (untested)

The repo has been updated.


Indeed (and thanks)! It’s been great practice, and I can feel my FFI kung fu improving steadily.

7 Likes

This is great! Thanks for the posts!

1 Like