Linear types are primarily intended for safety and unlifted types are partially intended for more safety, so I think it would count as a use and not an abuse.
A lot of the practical answers to this will lie in how we design our higher-level API, which will probably involve constructs akin to
withStoredPrivateKey keyRef $ \ key -> ...
, and we’ll have to build stores and atomic cryptographic operations to guide the user away from ever handling raw / exposed keys and such.
@cdepillabout and I keep the aforementioned concerns in mind when we add to or change anything in the password
library.
The Password
type explicitly has only one way of getting to the actual string of characters, which is the very obviously named unsafeShowPassword
function.
I’m also very hesitant to allow a ToJSON
instance for these exact reasons.
If anyone with crypto expertise would like to check out the library and advise on improvements, that’d also be very appreciated.
The repo has been updated with the following:
- Completed low-level bindings for
Botan.Error
,Botan.Utility
,Botan.Hash
modules - Added missing
Botan.Error
error codes, botanErrorDescription function - Implemented missing
Botan.Utility
functions for memory scrubbing, base64 encoding and decoding - Implemented missing
Botan.Hash
functions for copying state, block size - Added flags for lower-case hex encoding
- Removed
memory
dependency - see issue
Its creeping towards minimally functional, though low-level it may be - it will clean up nicely with higher-level bindings later. I think my next target may be the RNG module low-level bindings.
Another day, another module! This time, it’s Random
number generators!
The repo has of course also been updated, with the following:
- Added
Botan.Random
(the RNG interface)- The
Random
random number generator opaque type - The
RandomType
type for specifying the type ofRandom
-
randomInit
andrandomInitWith
functions to create aRandom
-
randomGet
andsystemRandomGet
functions for gettingn
bytes of random. -
randomReseed
andrandomReseedFromRandom
reseeding functions -
randomAddEntropy
function for adding your own bytes of entropy - NOTE:
botan_rng_init_custom
function is not implemented. It looks complicated, for now.
- The
- Switched a few
alloc
tomalloc
because long-lived references were being freed* - Added pure
hashWith
convenience method
* If someone could sanity check the
ForeignPtr
initialization pattern I’m using, I would appreciate it.
If there are any particular modules that you would like me to tackle next, or if you would instead like me to focus a little more on developing some higher-level bindings to Hash
/ Random
, please let me know. Otherwise, I’m following the Botan FFI header which means that message authentication codes aka MAC
would be the next module. I’m open to listening to the community on this
This is great progress! I’m really enjoying these posts!
The sun has risen again. You know what that means. The repo has been updated again.
- Added
Botan.Mac
module for Message Authentication Codes- Construction of Botan mac type strings (eg, “HMAC(SHA-256)”, “SipHash(2,4)”) is badly-documented.
- Some MACs require nonces, but this is also badly documented.
- Refined handling of foreign pointers of botan objects / initializers / finalizers
Today’s efforts are going to be focused on updating the foreign pointer handling in Botan.Hash
and Botan.Random
, and getting a better sense of what needs to be standardized - eg, whether I will continue with straight 1:1 bindings, or instead do some encapsulation, considering:
- Things like the hash / mac / rng type strings.
macInitName "HMAC(SHA-256)"
are both incredibly simple, yet incredibly awkward, and need constants at the very least. - Checking for algorithm support is also awkward (read, effectively non-existent)so I need to pick a method of handling it, and standardize that as well.
So basically a grab-bag of things that I want to take care of before we get enough modules to make them difficult.
Also, by line count, we are almost 1/4 of the way through the botan ffi header! Things do get denser towards the bottom, especially with x509
certs, but still…
@david-christiansen
I’m glad you are! Tough work is so much less of a slog when there are people rooting for you along the way, and the community’s response has really been keeping me going
This is just a minor update, with no changes pushed as of yet. The stuff I’ve been working on is more process refinement, rather than a new module - though I am using the Cipher
module as a testbed for this, the process refinement is the primary intent.
I’ve been doing some work to make implementations use a set of common initializer functions. Botan
has a consistent enough structure that many separate structures share common patterns that are reusable, or at least minimal variation.
An example of what that looks like:
data CipherStruct
type CipherPtr = Ptr CipherStruct
newtype Cipher = MkCipher { getCipherForeignPtr :: ForeignPtr CipherStruct }
withCipherPtr :: Cipher -> (CipherPtr -> IO a) -> IO a
withCipherPtr = withForeignPtr . getCipherForeignPtr
foreign import ccall unsafe botan_cipher_init :: Ptr CipherPtr -> CString -> CipherFlags -> IO BotanErrorCode
foreign import ccall unsafe "&botan_cipher_destroy" botan_cipher_destroy :: FinalizerPtr CipherStruct
foreign import ccall unsafe botan_cipher_name :: CipherPtr -> Ptr CChar -> Ptr CSize -> IO BotanErrorCode
foreign import ccall unsafe botan_cipher_output_length :: CipherPtr -> CSize -> Ptr CSize -> IO BotanErrorCode
cipherInit = mkInit_name_flags MkCipher botan_cipher_init botan_cipher_destroy
cipherName = mkName withCipherPtr botan_cipher_name
cipherOutputLength = mkOutputLength_length withCipherPtr botan_cipher_output_length
If we peek under the hood, we can see that I have just taken the pattern that we were using and formalized it as a higher-order function by lifting the concretely-typed constructor / withPtr / funPtr
s out as arguments (I am a fan of this pattern).
mkInit_name_flags constr init destroy name flags = do
alloca $ \ outPtr -> do
ByteString.useAsCString name $ \ namePtr -> do
throwBotanIfNegative_ $ init outPtr namePtr flags
out <- peek outPtr
foreignPtr <- newForeignPtr destroy out
return $ constr foreignPtr
mkName withPtr name template = withPtr template $ \ templatePtr -> do
alloca $ \ szPtr -> do
bytes <- allocBytes 64 $ \ bytesPtr -> do
throwBotanIfNegative_ $ name templatePtr bytesPtr szPtr
sz <- peek szPtr
return $ ByteString.copy $ ByteString.take (fromIntegral sz) bytes
mkOutputLength_length withPtr funPtr template length = withPtr template $ \ templatePtr -> do
alloca $ \ szPtr -> do
throwBotanIfNegative_ $ funPtr templatePtr (fromIntegral length) szPtr
fromIntegral <$> peek szPtr
Using this pattern and declaring implementations this way should result in easier-to-read and more reusable code, and help minimize the potential for bugs. It will also significantly reduce the effort required to implement and maintain this library in the long run.
I am happy to present a major update to the bindings library, which has been pushed to the repo. Most of the work was focused on systemically improving what was already written so far, but as a result, I was also able to implement several more modules!
I have:
- Created a
Guide.md
to help standardize initialization / destruction / handling of botan objects - Added convenience
as-
functions toBotan.Prelude
for bytestring pointer handling - Improved botan error code handling for positive values
- Improved
Botan.Utility
nomenclature - Created
Botan.Make
template object initialization / destruction / accessor functions to reduce boilterplate- This will greatly reduce the effort required to implement most functions
- Only ‘unique’ functions now need be implemented manually
- Minimizes attack / error surface, and improves the consistency of code
- Implemented
Botan.Cipher
as test for using newBotan.Make
functions - Re-implemented
Botan.Hash
,Botan.Mac
using make-functions - Implemented
Botan.BlockCipher
,Botan.KDF
, andBotan.Password
modules- None of algorithms for
Botan.KDF
appear to be supported by my botan installation, so it is untested
- None of algorithms for
The biggest thing is the Botan.Make
module, which lets me handle most bindings very simply - it is now possible to implement most binding functions as such:
foreign import ccall unsafe botan_cipher_set_key :: CipherPtr -> Ptr Word8 -> CSize -> IO BotanErrorCode
cipherSetKey :: Cipher -> ByteString -> IO ()
cipherSetKey = mkSetBytesLen withCipherPtr botan_cipher_set_key
This makes implementing bindings vastly more easy - as a result of this, we are almost half-way through the Botan FFI header, with the MPI (Multiple Precision Integer aka BigInt) interface being the next target - it is necessary dependency of the public / private key modules.
This sounds like a great improvement. It’s always interesting to see how working with a problem for a while brings insights that weren’t apparent in the beginning simply from looking at a spec.
I took a peek at your code for cipherUpdate
and saw that you’re using a ByteString
as an intermediate buffer for botan_cipher_update
. You then copy the content into a new ByteString
for the return value. Wouldn’t it save a heap allocation to use allocaBytes
for the intermediate buffer and then create a new ByteString
from it for the return value?
Another question: do you have plans to create a sum type for the cipher names etc. that are stringly-typed? Or is that a fool’s errand due to the maintenance overhead?
Nothing quite substitutes for getting one’s hands dirty deep in the guts of a problem!
Indeed it would, and there are other places too for which there is unnecessary copying - it is mentioned in the TODO list in the README. For the moment, I have opted for safety over speed, but I plan on doing an optimization pass once I am satisfied that everything works.
cipherUpdate
in particular is going to require a fair bit of work to get past low-level 1:1 bindings, because its behavior is rather dependent on the algorithm. While it plays nicely with ChaCha(20)
because I can dump the entire plaintext at once, it gets rather nuanced for ciphers that have a required buffer granularity like AES-256/GCM
, which means that I’m not just going to have to optimize cipherUpdate
but also the chunking of the input by granularity as well
It is a difficult problem that I have given a lot of thought. Part of the difficulty in building high-level cryptography bindings is that many of the low-level primitives do not present a uniform interface, and many of the things that appear to be ‘primitives’ are themselves actually compound constructs. It is only at the higher levels that everything looks the same.
Botan avoids this by making the name more than just a simple enumerable value - it is a specification format that it parses into a valid construct using the parameters supplied in the format. This isn’t immediately obvious, but it can be discerned by using the name functions, which will print out the fully expanded name. Some examples:
- A
Hash
type often has several variants differing by digest length -SHA-3
is actually shorthand forSHA-3(512)
, but the format chooses default values. - The
Mac
type takes a hash algorithm as a parameter, and soHMAC(SHA-3)
is valid, but so isHMAC(SHA-3(512))
- It is a similar or worse situation for ciphers and other constructs -
AES-256/CTR
isCTR-BE(AES-256)
, butAES-256/GCM
is reallyAES-256/GCM(16)
. - Then there’s things like
AES-128/CBC/PKCS7
… - I have not yet found a listing / description of these name formats, but have found information by going through the source code and examples.
- Even worse, not all mentioned algorithms are actually supported
- The only way to find out if an algorithm format / combo is supported is to try and construct an object and have it fail with ‘Not implemented’ if it isnt
A simple sum type is almost certainly out of the question - I’m not even sure that all algorithms are enumerable, because some may take unbounded parameters (eg, I think maybe PBKDF(n)
), others take multiple parameters (SipHash(2,4)
) .
I’m not entirely certain what I’m going to do, but my current thought process is to make it easy to generate names from a primitive or compound spec, something like:
data SHA3Spec = SHA3Spec Int -- Or even SHA3_256 | SHA3_384 | SHA3_512
data HashSpec
= SHA3 SHA3Spec
| ...
data MacSpec
= HMAC HashSpec
| ...
I’ll be focusing a lot more on this sort of thing once I complete the 1:1 low-level bindings, but right now using any of it feels really crude - and I have a fair bit of cryptography knowledge! I can see it being nigh-impossible for anyone else to use, at the moment.
Wow, thanks for the detailed reply!
I noticed from your example that there was probably some kind of hierarchy in the cipher name. It’s a lot more complex than I realized, however. Your idea for using the full richness of ADTs to model the possibilities seems good, but I’m sure the devil will be in the details.
Good luck, and thanks for working on this.
Hi ho! Hi ho! Another update we go!
Changes are:
- Implemented Botan.Bcrypt module
- Implemented Botan.MPI module
- Improved error handler nomenclature
- Added mk- functions for exceptional error handlers
- Renamed CipherFlags → CipherInitFlags
The major thing here is the Multiple Precision Integer aka a BigInt implementation. It’s pretty awkward having to initialize and set integers manually right now, but it should clean up nicely in the future when we’re giving everything more idiomatic bindings and can make it conform to the numeric classes.
Next up is private and public keys, which are frankly a bit of a beast because they have a whole bunch of algorithm-specific functions, but some of it is deprecated so perhaps it won’t be too bad
Today’s devlog entry is spicy - it’s the start of asymmetric / public key cryptography! We can now generate keys, and sign and verify messages!
Changes include:
- Implemented
Botan.View
for botan_view_ctx functions - untested - Renamed
Botan.Password
→Botan.PwdHash
for consistency - Moved
Botan.Make
to other-modules stanza - Started
Botan.PubKey
(public key cryptography) module- Implemented
PubKey
andPrivKey
generation
- Implemented
- Implemented
Botan.PubKey.Sign
andBotan.PubKey.Verify
(digital signatures), tested with Ed25519 signatures
I chose Ed25519
for testing because it does not involve any extra parameters, and as with previous modules, the particular format is undocumented and I am having to go through the C++ docs which are different from the C FFI docs to try and figure it out. The old Z-Botan
bindings are turning out to be more helpful here with providing the necessary constants, even if the bindings themselves were not as usable.
The repo of course has been updated.
So the general state of it right now is still rather raw, but we’ve passed the halfway mark for 1:1 bindings. Here’s what we have left:
- The rest of the PubKey operations
- FPE (Format Preserving Encryption) functions
- HOTP (HMAC-based One-time Password) algorithms
- NIST Key Wrapping functions
- SRP-6 Server protocol
- TOTP (Time-based One-time Password) algorithms
- X.509 certificates and revocation lists
- ZFEC (Zero-Copy Forward Error Correction)
At a high level (eg, the entire todo-list notwithstanding), aside from the PubKey operations and X509 certs, what remains is mostly the more esoteric cryptography operations. It surprised me, but for comparison, if I were to finish implementing PubKey + HOTP, that would put these bindings roughly on par with libsodium
in terms of cryptographic operation coverage, if not in ergonomics - but when we do get to doing the higher-level bindings, it is all going to clean up rather nicely.
It’s great to read on your progress, thanks for your work on this and on the updates!
Weekend update!
The changes in this update are:
- Implemented
allocBytesQuerying
toBotan.Make
- Implemented a few more
Botan.PubKey
functions - Implemented
Botan.PubKey.Encrypt
module - Implemented
Botan.PubKey.Decrypt
module
This one was spent mostly dealing with a bit of a head-scratcher - how do you allocate a buffer before you know how large it must be? I’ve been either calling length functions, explicitly calculating, or allocating in excess so far, but now I’ve pieced together the appropriate / relevant documentation on how the size pointers should be handled - its a little awkward, but you can pass in null instead of a buffer to query for the buffer length.
Seeing how safety of implementation is one of our critical concerns, I’ve written an allocBytesQuerying
to handle this, and I’ve implemented a few modules using it as a testbed - public key encryption and decryption! I still need to go back and retrofit previously-written functions, and likely will by the next update.
Naturally, the repo has been updated
Hah it’s fun how you end up learning the idiosyncrasies of a C library by writing bindings for it like that. Good work!
Another update. It’s a bit smaller since ZFEC
didn’t make it quite yet (multidimensional pointer arrays ), but I got a few other things done.
Changes are:
- Implemented Botan.HOTP (HMAC-based One-Time Password) module
- Implemented Botan.TOTP (Time-based One-Time Password) module
- Implemented Botan.KeyWrap (untested)
The repo has been updated.
Indeed (and thanks)! It’s been great practice, and I can feel my FFI kung fu improving steadily.
This is great! Thanks for the posts!