Nothing quite substitutes for getting one’s hands dirty deep in the guts of a problem!
Indeed it would, and there are other places with unnecessary copying too - it is mentioned in the TODO list in the README. For the moment, I have opted for safety over speed, but I plan on doing an optimization pass once I am satisfied that everything works.
`cipherUpdate` in particular is going to require a fair bit of work to get past low-level 1:1 bindings, because its behavior is rather dependent on the algorithm. While it plays nicely with `ChaCha(20)` because I can dump the entire plaintext at once, it gets rather nuanced for ciphers that have a required buffer granularity like `AES-256/GCM`, which means that I’m not just going to have to optimize `cipherUpdate` but also the chunking of the input by granularity as well.
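To make the chunking concern concrete, here is a minimal sketch, assuming a cipher that only accepts update input in multiples of some granularity and takes whatever is left in a finishing call. The helper `splitByGranularity` and its parameters are hypothetical and not part of the bindings; the real granularity and finishing rules depend on the algorithm.

```haskell
import qualified Data.ByteString as BS
import Data.ByteString (ByteString)

-- | Hypothetical helper: split the input into update-sized chunks, each
-- exactly (granularity * blocksPerChunk) bytes long, and hold back the
-- last piece (at most one chunk in size) for the finishing call.
splitByGranularity
  :: Int          -- ^ granularity in bytes (assumed to come from the cipher)
  -> Int          -- ^ how many granules to feed per update call
  -> ByteString   -- ^ whole input
  -> ([ByteString], ByteString)
splitByGranularity g blocksPerChunk input
  | g <= 0 || blocksPerChunk <= 0 = ([], input)  -- nothing sensible to split on
  | otherwise = go input
  where
    chunkSize = g * blocksPerChunk
    go bs
      -- Strictly greater: the final piece is never empty for non-empty input.
      | BS.length bs > chunkSize =
          let (chunk, rest)   = BS.splitAt chunkSize bs
              (chunks, final) = go rest
          in (chunk : chunks, final)
      | otherwise = ([], bs)
```

With a granularity of 16 and one granule per call, a 40-byte input would yield two 16-byte update chunks and an 8-byte final piece. A stream cipher with granularity 1 could simply pass everything to the finishing call.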
It is a difficult problem that I have given a lot of thought. Part of the difficulty in building high-level cryptography bindings is that many of the low-level primitives do not present a uniform interface, and many of the things that appear to be ‘primitives’ are themselves actually compound constructs. It is only at the higher levels that everything looks the same.
Botan avoids this by making the name more than just a simple enumerable value - it is a specification format that it parses into a valid construct using the parameters supplied in the format. This isn’t immediately obvious, but it can be discerned by using the name functions, which will print out the fully expanded name. Some examples:
- A `Hash` type often has several variants differing by digest length - `SHA-3` is actually shorthand for `SHA-3(512)`, because the format fills in default values.
- The `Mac` type takes a hash algorithm as a parameter, and so `HMAC(SHA-3)` is valid, but so is `HMAC(SHA-3(512))`.
- It is a similar or worse situation for ciphers and other constructs - `AES-256/CTR` is `CTR-BE(AES-256)`, but `AES-256/GCM` is really `AES-256/GCM(16)`.
- Then there’s things like `AES-128/CBC/PKCS7`…
- I have not yet found a listing / description of these name formats, but have found information by going through the source code and examples.
- Even worse, not all mentioned algorithms are actually supported.
- The only way to find out if an algorithm format / combo is supported is to try to construct an object and have it fail with ‘Not implemented’ if it isn’t.
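The "try it and see" approach from the last point can be sketched roughly as follows. This is only an illustration: `probeSupported` is a hypothetical helper, and the constructor is passed in as a parameter because the actual binding functions are not assumed here. It also assumes that an unsupported name surfaces as an exception thrown in `IO`.

```haskell
import Control.Exception (SomeException, try)

-- | Did the attempted construction finish without throwing?
succeeded :: Either SomeException a -> Bool
succeeded = either (const False) (const True)

-- | Hypothetical helper: keep only the names for which the supplied
-- constructor succeeds. Catching SomeException is a blunt instrument;
-- a real version would match on the specific 'Not implemented' error.
probeSupported :: (String -> IO a) -> [String] -> IO [String]
probeSupported construct names = do
  results <- mapM attempt names
  pure [name | (name, True) <- results]
  where
    attempt name = do
      ok <- succeeded <$> try (construct name)
      pure (name, ok)
```

Given any constructor of type `String -> IO a`, `probeSupported construct ["SHA-3", "HMAC(SHA-3)"]` would return the subset of names that could actually be built.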
A simple sum type is almost certainly out of the question - I’m not even sure that all algorithms are enumerable, because some may take unbounded parameters (eg, I think maybe `PBKDF(n)`), others take multiple parameters (`SipHash(2,4)`).
I’m not entirely certain what I’m going to do, but my current thought process is to make it easy to generate names from a primitive or compound spec, something like:
```haskell
data SHA3Spec = SHA3Spec Int -- Or even SHA3_256 | SHA3_384 | SHA3_512

data HashSpec
  = SHA3 SHA3Spec
  | ...

data MacSpec
  = HMAC HashSpec
  | ...
```
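One way the spec-to-name idea might look in practice is sketched below. It uses the enumerated `SHA3_256 | SHA3_384 | SHA3_512` variant mentioned in the comment, and adds a `SipHash Int Int` constructor purely to illustrate multi-parameter names. The functions `sha3Bits`, `hashName`, and `macName` are hypothetical names for this sketch, and the output strings follow the name formats quoted earlier in the post rather than any documented grammar.

```haskell
-- Closed set of digest lengths, so invalid sizes cannot be expressed.
data SHA3Spec = SHA3_256 | SHA3_384 | SHA3_512

data HashSpec
  = SHA3 SHA3Spec

data MacSpec
  = HMAC HashSpec
  | SipHash Int Int  -- assumed constructor, illustrating numeric parameters

-- | Digest length in bits for each SHA-3 variant.
sha3Bits :: SHA3Spec -> Int
sha3Bits SHA3_256 = 256
sha3Bits SHA3_384 = 384
sha3Bits SHA3_512 = 512

-- | Render a hash spec in fully expanded form, e.g. SHA-3(512).
hashName :: HashSpec -> String
hashName (SHA3 s) = "SHA-3(" ++ show (sha3Bits s) ++ ")"

-- | Render a MAC spec; compound specs reuse the rendering of their parts.
macName :: MacSpec -> String
macName (HMAC h)      = "HMAC(" ++ hashName h ++ ")"
macName (SipHash c d) = "SipHash(" ++ show c ++ "," ++ show d ++ ")"
```

Here `macName (HMAC (SHA3 SHA3_512))` produces `HMAC(SHA-3(512))`. Always emitting the fully expanded name sidesteps the question of which defaults the format would otherwise choose, at the cost of not covering every shorthand.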
I’ll be focusing a lot more on this sort of thing once I complete the 1:1 low-level bindings, but right now using any of it feels really crude - and I have a fair bit of cryptography knowledge! I can see it being nigh-impossible for anyone else to use, at the moment.