Great work Leo and collaborators! It’s exciting to see this work being done.
Props to @BurningWitness and @ocramz for the PRs, and of course to @ApothecaLabs for the continued effort
That’s a pretty great update. I maintain a small (nearly-irrelevant) package which does FFI bindings. I’m going to have to play with `CApiFFI` and those `CTYPE` annotations, and try and modernise it.
Holiday Update: Fixing botan-low for the new CAPI changes
The last update was a good one, but it broke a few things in `botan-low` - hence why I’ve been working off of the `CAPI-experiments` branch. This update is about as big, and once again touches every module in a library - this time `botan-low`. It’s a bit late, so I’ll keep this short and sweet.
Change log / greatest hits
- `botan-low` now uses the newer, better data types and their `newtype` constructors
- The `mkBindings` generator function has been moved to `botan-low` and expanded
- Initializers now use generated `createObject` functions
- Destructors are now generated
- Functions now use `ConstPtr` where appropriate
- `FooCtx`-style objects have been renamed `Foo`
- `withFooPtr`-style functions have been renamed `withFoo`
- `mkFoo` functions in `Botan.Low.Make` are being reworked
- A suite of more consistent marshalling functions is being developed
- “Fixed” unit tests to the point of compiling if the XFFI test modules are deleted and removed from the cabal file (haven’t actually scrutinized the results yet)
- `botan` is still broken
Whew! That covered a lot and I’m still testing everything (in fact, the unit tests are still broken, but that’s because `hspec-discover` doesn’t respect our `if flag(XFFI)` and tries to include modules that won’t exist), so this update is still living on the `CAPI-experiments` branch for the time being.
There’s more work to do on `botan-low`, but I’m aiming to get it release-stable like how `botan-bindings` effectively is - that’s all for now!
This update has been pushed to the `CAPI-experiments` branch.
New Year’s First Update - Merging CAPI-experiments back into main
It’s the first update of the new year! And I bring some good things!
The first big point is that the `CAPI-experiments` branch has become stable enough that I have merged the changes back into `main`. That means all of the recent `CAPI` goodness has now made it back in, effectively freezing `botan-bindings` minus any additive changes.
The second big point is that I’ve added a bunch of pattern constants for algorithm names to `botan-bindings` and `botan-low` to match the data types in `botan`. This makes the lower-level libraries a tad more useful on their own, and will help guard against stringly-typed errors. As an example, we might have the following in `botan-bindings`:
pattern BOTAN_BLOCK_CIPHER_128_AES_128 :: (Eq a, IsString a) => a
pattern BOTAN_BLOCK_CIPHER_128_AES_128 = "AES-128"

pattern BOTAN_AEAD_CHACHA20POLY1305 :: (Eq a, IsString a) => a
pattern BOTAN_AEAD_CHACHA20POLY1305 = "ChaCha20Poly1305"

pattern BOTAN_AEAD_MODE_GCM :: (Eq a, IsString a) => a
pattern BOTAN_AEAD_MODE_GCM = "GCM"
But express it in `botan-low` more succinctly:
type BlockCipher128Name = ByteString

pattern AES128 :: BlockCipher128Name
pattern AES128 = BOTAN_BLOCK_CIPHER_128_AES_128

type AEADName = ByteString

chaCha20Poly1305 :: AEADName
chaCha20Poly1305 = BOTAN_AEAD_CHACHA20POLY1305

gcmMode :: BlockCipher128Name -> AEADName
gcmMode bc = bc // BOTAN_AEAD_MODE_GCM

-- With *extended* algorithm parameters
gcmMode' :: BlockCipher128Name -> Int -> AEADName
gcmMode' bc tagSz = gcmMode bc /$ showBytes tagSz

-- Formatting helpers
infixr 6 //
(//) :: (IsString a, Semigroup a) => a -> a -> a
a // b = a <> "/" <> b

infixr 0 /$
(/$) :: (IsString a, Semigroup a) => a -> a -> a
a /$ b = a <> "(" <> b <> ")"
And even further, give it a proper data type in `botan`:
data AES
    = AES128
    ...

data BlockCipher128
    = AES AES
    ...

data BlockCipher
    = BlockCipher128 BlockCipher128
    ...

-- NOTE: Current datatype in `botan` does differ slightly
data AEAD
    = ChaCha20Poly1305
    | GCM BlockCipher128 (Maybe Int)
    ...
I’ve tried to strike a balance between safety and exposing functionality by providing constants for algorithm families and modes, without making a constant for every combination. I’ve yet to make the `botan` data types use the `botan-low` constants and functions properly, but I’ll be working on that shortly.
NOTE: I haven’t used `Ptr "TheConst\0"#` because most algorithms have parameters, and we need to glue the parts together for compound algorithms - unrolling all of that would mean thousands of constants.
The third big point is that I’ve split the unit tests up into individual targets so that it’s easier to see what’s passing and failing - no longer will cipher unit tests spam everything with test failures! This is great because I can just test everything with:
cabal test botan-low
Or I can test specific things via
cabal test botan-low-hash-tests
I’ve also fixed the unit tests - some were broken by being split into separate targets, but I had to get the pattern constants in first as preparation. However, this also means that tests can no longer use `hspec-discover` - 1) because it doesn’t respect flags and 2) each file is getting its own test-suite anyway - alas!
Unit tests also now (somewhat) use the new algorithm name constants, though I’m still working on that - I’ve tried to be careful, but I’ll be scrutinizing these changes quite severely as I go over them again a few times in coming days.
The last big point is that I’ve also gotten `botan` building again with all of the changes in the last month or so that broke it. It had been a while since I last poked around the high-level library, since so much attention has gone towards `botan-bindings`, `botan-low`, and `botan-upstream`, but I was pleasantly surprised by how I had left things.
I still have a few things to do before I can declare `botan-low` stable enough for versioning like `botan-bindings`, however - I need to go over everything with a fine-toothed comb:
- Find and fix any missing constants
- Implement the view-bin and view-str functions
- Clean up unit tests, especially algorithm test suite generators
- Consolidate the `Make` and `Remake` files
- The mystery of why certain unit tests started passing
  - Probably was the cipher tests, because of the simplified set of algorithms that we’re testing against (we’ll leave extensive testing to `botan`)
- Move some of the non-canonical functions from `botan-low` to `botan`
  - I’m specifically thinking of the cipher online-vs-offline functions and the hash convenience functions
- A little bit of nomenclature standardization
  - `BOTAN_BINDINGS_FOO_VAL` vs `LowFooVal` vs (high-level) `val`
But aside from those things, `botan-low` has arrived near some critical level of stability, much like `botan-bindings`. Now that the lower libraries aren’t going to be changing as much, and the roles of the various library levels are more clear, I’m effectively playing ‘fill in the blanks’, and the majority of my focus can now be on getting the high-level bindings in `botan` to that sort of state too. With all of the preparation we’ve made, I hope that it should go fairly quickly.
All-in-all, I’m really relieved to be getting all of this merged back into `main`. There are a lot of modules to juggle, so it’s nice when it all aligns nicely and everything builds as it should - `botan-bindings`, `botan-low`, `botan`, and all of the `botan-low-*-tests`.
These changes have been merged to `main` and pushed to the github repo.
That’s it for now, though you should check back soon for the incipient monthly status report!
Update: Work on `botan` recommences!
Now that the lower-level libraries are more or less stable, I’ve had the pleasure of being able to focus back on the high-level `botan` library, with its idiomatic and pure-as-possible interface. I made several large strides in the background while writing the latest monthly update, since I am finally getting to work on a bunch of fun stuff that had to be put off until now.
First off, `Botan.RNG` has been strongly refined, and use of random generators has been ensmoothened with the introduction of the `MonadRandomIO` class and the `RandomT` monad transformer (which is an instance of `MonadRandomIO`). Getting this right is essential - because of the many functions that rely on it, it affects the ergonomics of the rest of the library.
Now, there are two ways of accessing randomness:
- Directly, using an `RNG` context
- Implicitly, using `MonadRandomIO` to access an `RNG` context
Direct usage is the old way, and it pretty much looks like this:
main = do
    rng <- newRNG Autoseeded
    addEntropyRNG "Fee fi fo fum!" rng
    reseedRNG 32 rng
    x <- getRandomBytesRNG 12 rng
    print x
Implicit access to an RNG through `MonadRandomIO` is the new way. `IO` is itself a convenient instance of `MonadRandomIO` that uses the `systemRNG`:
main = do
    addEntropy "Fee fi fo fum!"
    x <- getRandomBytes 12
    print x
It is also possible to use the `RandomT` transformer or `RandomIO` monad (currently type aliases for `ReaderT RNG` and `ReaderT RNG IO` respectively):
main = do
    rng <- newRNG Autoseeded
    flip runRandomIO rng $ do
        addEntropy "Fee fi fo fum!"
        x <- getRandomBytes 12
        liftIO $ print x
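Under the hood, the class is currently shaped something like the sketch below - the method name and instances are my shorthand for this post, and may not match the actual code exactly:

{-# LANGUAGE FlexibleInstances #-}
import Control.Monad.IO.Class (MonadIO)
import Control.Monad.Trans.Reader (ReaderT, ask)
-- RNG and systemRNG are assumed to come from Botan.RNG

-- Monads that can supply an RNG context
class MonadIO m => MonadRandomIO m where
    getRNG :: m RNG

-- IO just falls back to the system RNG
instance MonadRandomIO IO where
    getRNG = pure systemRNG

-- RandomT / RandomIO read the RNG out of the ReaderT environment
instance MonadIO m => MonadRandomIO (ReaderT RNG m) where
    getRNG = ask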
I’m not exactly sure how `MonadRandomIO` will change under the hood (`RandomT` is probably going to become a newtype at least), but basically, any functions that need random values, or that take an RNG as an argument, can now be `MonadRandomIO` instead - this should make it really easy to generate keys and whatnot, and reduce the number of arguments in general.
For example, with `MonadRandomIO`, `bcryptGenerate` now only takes 2 parameters!
main = do
    dg <- bcryptGenerate "Fee fi fo fum!" Fast
    print dg
Secondly, all of the algorithm name constants and functions have been completely vertically integrated, from `botan-bindings` to `botan-low` to `botan`. This makes using a given algorithm much easier and more consistent no matter what level of library you are using.
Thirdly, I’ve created a `KeySpec` class for a better representation of what key sizes are available for a given primitive. It’s similar to the `KeySpecifier` from `crypton`. I’m probably going to rename it.
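Roughly, the idea is something like the following sketch - the constructor and helper names here are illustrative guesses rather than the final API:

-- What sizes of key a primitive accepts
data KeySpec
    = KeySpecFixed Int          -- exactly this many bytes
    | KeySpecRange Int Int Int  -- minimum, maximum, and modulo
    | KeySpecEnum [Int]         -- an explicit list of valid sizes

-- Each primitive exposes its spec through a class
class HasKeySpec alg where
    keySpec :: alg -> KeySpec

-- Helpers can then answer the common questions
keySizeIsValid :: KeySpec -> Int -> Bool
keySizeIsValid (KeySpecFixed n)        sz = sz == n
keySizeIsValid (KeySpecRange lo hi md) sz = lo <= sz && sz <= hi && sz `mod` md == 0
keySizeIsValid (KeySpecEnum ns)        sz = sz `elem` ns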
Fourthly, in the Botan C FFI, you have to initialize a cipher, hash, or MAC context in order to query its sizes, but this data is constant and initializing a context is not free, so I’ve been writing pure / static versions of the key spec / block size / tag length / other size query functions so that I can just say:
let bsz = hashBlockSize MD5
Instead of:
bsz <- do
    ctx <- Low.hashInit (hashName MD5)
    Low.hashBlockSize ctx
This is just overall a much better experience, as we can get algorithm parameters by referencing the algorithm itself instead of referencing an initialized context. It is really helpful for generating keys even if you aren’t looking to use them immediately.
Lastly, CI is working - at least for macOS. I need to test manual Botan C++ installation for Linux, as I’ve determined that `botan 3.x` packages haven’t been published yet, but that seems to be the only issue at the moment!
As always, these changes have been pushed to the repo
> Lastly, CI is working - at least for macOS. I need to test manual Botan C++ installation for Linux, as I’ve determined that `botan 3.x` packages haven’t been published yet, but that seems to be the only issue at the moment!
Which Linux distro were you looking at? It looks like Gentoo’s shipping botan-3.2.0 right now, at least for amd64 and ppc/ppc64.
I was using the latest Ubuntu (23.10), and it turns out that that’s the culprit - Botan 3.x+ is too recent to be in the 22.04 LTS and apparently hasn’t made it into 23.10 yet. So Ubuntu needs to follow manual installation for now; in other words, the CI can be fixed rather simply and it’s on my todo list!
Update: `botan` lurches to life!
Hello everyone, I’m sorry that I’m a bit late. I’ve been doing a lot of thinking, and it can be a bit difficult to post while that is still underway - I am the epitome of ‘think before you speak’, taken to pathological extremes.
The `botan-low` interface works, so if you don’t mind putting up with its clunky low-level interface, you can have at it already to get stuff working - but the whole point of the higher-level libraries is a better interface, so it’s worth thinking heavily about.
While stewing about, I’ve been able to do a lot of the rather mechanical but still useful work in getting the high-level `botan` library up and running - regardless of what interface I eventually choose to express, there are certain things to be done along the way, so I might as well get them out of the way.
As a result, if you’ve gotten used to anything in the highest-level `botan` library, I’m afraid it’s changed heavily. Some of the modules that existed prior have been reworked for consistency, and the others will be too, soon enough.
Most of the modules are nominally completed:
- Botan.RNG
- Botan.Bcrypt
- Botan.BlockCipher
- Botan.Cipher (minus lazy / online processing)
- Botan.Hash
- Botan.KeySpec
- Botan.KDF
- Botan.MAC
- Botan.PubKey
- Botan.PubKey.Load
- Botan.PubKey.Encrypt
- Botan.PubKey.Decrypt
- Botan.PubKey.Sign
- Botan.PubKey.Verify
- Botan.SRP6
Other modules are underway as I investigate the various cruft and minutia that’s piled up. I’ve found some gnarly issues with `PubKey.Sign` since not every pubkey algo works with every signing algo, and I’m seeing some odd things like PEM signatures only working in unit tests and DER signatures working only in GHCi. I’ll slog through it eventually, but for now, try not to stray from the established path, for here be dragons!
This represents our first stab at an idiomatic interface, so it isn’t great, but it isn’t terrible. I’ve mostly made things pure where I could, and `MonadIO` or `MonadRandomIO` where I couldn’t. I’ve tried to give everything proper types, as it helps me smooth the course and see what idioms need to be applied in a given case, but now that it’s here, I know that I can do better. There’s still a lot of other things to get through, but I’ve got a whole checklist and templates to speed all of it up.
If you want to take a look at the sort of increased ergonomics I have in mind, check out the source code to `Botan.SRP6` - the exposed interface is mostly a straightforward translation of the `botan-low` interface, but if you scroll down further, you can see I’ve been building server and client session logic to abstract away a lot of the fiddly bits - I want to do that for all modules, as appropriate!
This of course leaves us with a lot of decisions to make - those pesky things that I’ve been thinking about - so by no means is anything in `botan` considered stable yet. We have a stable low-level library which works functionally, and now we are trying to get the ergonomics right:
- Decision: FooType and Foo vs Foo and MutableFoo
  - `Foo` type and `MutableFoo` context, vs `FooType` type and `Foo` context
  - RNG is the exception? Why? (Because the context is not mutating, it is non-deterministic)
  - Current decision: `Foo` type and `MutableFoo` context; reasoning: the highest-level interface gets the simplest names
- Decision:
  - Flatten SHA / etc variants into larger types?
  - Or define function aliases? Eg `sha3 = Cryptohash $ SHA_3 SHA3_512`
  - Current decision: Undecided
    - The more I have to use it in its current state, the more it annoys me…
- Decision:
  - Terminology: algorithms with defaults
  - `Low` uses `foo` for default / argless and `foo'` with an apostrophe for the function with args
  - Might switch to `fooDefault` for default, and `foo`
  - Current decision: Undecided
- Decision:
  - Classes for things like HasKeySpec / KeySettable, HasBlockSize, etc
  - Current decision: No. It is probably best left for a higher-level classy library, which can make botan instances.
- Decision:
  - Use Enum + Bounded for algos?
  - Initial solution: ADT trees + functions (manual solution, eugh)
  - Current decision: None, defaulting to all of the `fooName` functions…
- Decision:
  - Designed for qualified or unqualified import?
  - Probably want unqualified import for high level, but what about Mutable?
- Decision:
  - How to treat nonced MACs (GMAC and Poly1305) vs non-nonced MACs (deterministic to the key+text)?
  - We can apply a `MonadRandomIO` constraint and ~~ignore the difference~~ return the nonce too, but it is unnecessary for most MACs
  - Initial solution: `data MAC = DeterministicMAC DeterministicMAC | NonceMAC NonceMAC`
  - Afterthought: `setMACNonceIfNeeded :: (MonadRandomIO m) => MutableMAC -> m ()`?
  - Current solution: Only GMAC is actually nonced (Poly1305 folds it into the key), so GMAC gets gmac-specific functions
    - Or we could do what nacl / saltine do and make Poly1305 a distinct OneTimeAuth / MAC
    - We could then pull out GMAC as a non-deterministic MAC similarly
- Decision:
  - FooSize vs FooLength
  - Eg, DigestSize vs DigestLength, BlockSize vs BlockLength
  - Initial solution: Undecided, but considering standardizing on ‘Size’
  - Currently standardizing on: `Size` for algorithm components, `Length` for (plain- and crypt-) texts
- Decision:
  - Split `MutableCipher` into `MutableEncipher` and `MutableDecipher`?
    - Would be consistent with PK encrypt, sign
    - Would obviate the `CipherDirection` type
  - Current decision: Keeping track of encrypt vs decrypt in the mutable context
- Decision:
  - Create a `NonceSpec` data type? Or generalize to `SizeSpec` / `SizeSpecifier`?
  - Current decision: Unconsidered, still using `validNonceSize` and `defaultNonceSize` functions
- Decision:
  - Terminology: `validKeySize` vs `defaultKeySize`
    - Also for nonces
    - Also size vs length (size for elements, length for message / ciphertext?)
    - Size slightly implies (a somewhat) fixed value, whereas length is more instanced
  - Leaning towards `defaultFooSize :: alg -> Int` and `validFooSize :: alg -> Int -> Bool`
- Decision:
  - Error handling
    - Some (mutable) functions are failable if not used in the proper order
    - Some (mutable) functions are failable because the specific algorithm lacks support
    - Some (user) errors are not fatal (eg, setFooKey with an incorrect key length)
      - Should we catch the error and return a bool?
    - How do we express this?
  - Current status: Allowing exceptions to be thrown - are exceptions satisfactory here?
- Decision:
  - Cipher (and other processing algorithms) need to conform to a consistent interface, but the APIs have differences
  - Example: Nonces in ciphers as an argument to `cipherStart`, vs (G)MAC’s `set(G)MACNonce`
  - Mostly affects only the mutable interface, in that they may have a required order or some other peculiarity that is only visible to the internals
    - The use of a nonce is associated with that specific instance of processing, and `setNonce` is freer than `start(WithNonce)`, which has a specific order of use.
    - We could imagine the other cryptoprocesses as having a no-op `start` function
  - Arguably, the highest-level API should take all of the arguments, such that the implementation’s order of application doesn’t matter.
- Decision:
  - Clarify ‘clear’ vs ‘reset’ consistently
  - Some algorithms only have ‘clear’, but others have a more limited ‘reset’ that preserves keys
- Decision:
  - Push the higher MutableFoo terminology down to `botan-low` (eg, `setFooBar` instead of `fooSetBar`)
  - Current status: Pondering, no need for the churn at the moment
- Decision:
  - Collapse modules together?
    - HOTP + TOTP = OTP
    - No PubKey submodules?
  - Specificity vs ease of use
    - High-level libraries focus on ease of use, the low-level libraries are quite specific; there is benefit to collapsing these modules in `botan`
  - Addendum: Collapsed all of the pubkey algorithm-specific modules down to `Botan.PubKey.Load`
- Decision:
  - What to call `PubKey`?
  - The module name references Public Key Cryptosystems, of which public keys are a component.
  - Possibly rename it `PKC` or `CryptoSystem` (or should `CryptoSystem` be the generalized concept, not just public keys?)
  - With the `PKC` namespace, it may make a bit more sense to move `PrivKey` and `PubKey` under it a la:
    - `Botan.PKC.PubKey`
    - `Botan.PKC.PrivKey`
    - `Botan.PKC.Encrypt`
    - `Botan.PKC...`
- Decision:
  - Use of MP in APIs?
  - Can take Integer instead, eschew Botan.MPI entirely
  - Result: Yeah, definitely elide MP entirely in favor of Integer
- Decision:
  - Elide pointless accessors (such as length queries that have already been used in Botan.Low)
- Decision:
  - How to represent PubKey types that are only used for specific operations
  - That PubKeys require an algo and params causes issues with the current setup for pk operations
  - Example: What signing algos are usable is dependent on what pubkey is used
- Decision:
  - How to deal with the gnarly algorithm hierarchies?
  - Break algorithms up into individual data types, and use classes?
    - Currently we end up with stuff like `pkSign rsa (EMSA $ EMSA4 (Cryptohash $ SHA2 SHA512) Nothing) ...` as opposed to something like `pkSign rsa (EMSA4 SHA512)`
  - Would require cryptographic classes (crypto-schemes) and botan instances
  - Decision: not yet
    - Much longer scope
- Decision:
  - Botan’s unified data types are a mixed bag - convenience, but problems too
  - Functions with keys can fail if an incorrect key is used, eg: `mac :: MAC -> MACKey -> ByteString -> Maybe MACDigest`
  - Algorithm-specific keys can be assumed valid and thus we can get rid of the Maybe: `sha512hmac :: SHA512MACKey -> ByteString -> MACDigest`
  - Right now we use exceptions if the key is incorrectly sized, should we keep doing that?
    - Or should we convert all of these exceptions to Maybe?
    - Or should we expose algorithm-specific functions?
  - Current result: Mostly still throwing exceptions
So yeah, a lot. It is taking shape though, and just re-reading the list of decisions-to-be-made as a whole helps give me a direction, and you can of course provide any feedback you might have.
As always, the repo has been updated.
Wow, there’s lots still to be done, it seems
A few things that pop into my head after reading this and looking through the `SRP6` and other modules:
- Please use `newtype` for any types that will be function arguments. The worst thing to happen is switching up `ByteString` arguments and not knowing until runtime that you’re using the password as the salt, or some other mixup (see the sketch after this list).
- Exporting `SRP6Salt(..)` does nothing if it’s a type synonym, right? Might even produce some warnings? Also will export any constructors if you change it to a `newtype`, which might not be desirable.
- As a general rule, I’d probably not export any constructors of types that shouldn’t be directly fiddled with, as I expect that will be the case in a lot of modules for crypto functionality.
- I’m not very knowledgeable on all the different permutations/combinations of hashes and algorithms, but having separate functions for separate combinations would make for a more pleasant API, IMHO. I.e. make a `sha512hmac` if that provides an API where it is guaranteed to work. This way, you’d also have sections in the documentation for every hash/algorithm, so you can more accurately provide the caveats and other side-notes that come with each hash/algorithm combination.
- I see here that the `HMAC` constructor takes a `Hash`, but the comment says it should never be a `Checksum`, so should it just take a `CryptoHash`?
- Is `SipHash 2 4` the only valid `SipHash`? Might make more sense to just name the constructor `SipHash24` then?
- I personally would like that these functions never throw exceptions, but please return `Maybe` or `Either` if I have to be ready for them to fail. Though setting up the API so that non-failing combinations are easy and type-safe should be a priority, I feel.
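For example (just a sketch of the idea, with made-up names):

import Data.ByteString (ByteString)

-- Distinct newtypes for each argument; don't export the constructors from the
-- real module, so users can only obtain values through the proper functions.
newtype SRP6Salt     = SRP6Salt     ByteString
newtype SRP6Password = SRP6Password ByteString

-- A hypothetical signature: passing the password where the salt goes is now
-- a type error instead of a runtime surprise.
hashCredentials :: SRP6Salt -> SRP6Password -> ByteString
hashCredentials (SRP6Salt salt) (SRP6Password pass) = salt <> pass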
I have lots of opinions on making good developer experiences with nice APIs, but I just don’t know what all the options are. I’d be happy to go through the `botan` library with you once it’s more “done” to help with nailing down the API and/or to do some brainstorming in general.
Oh there’s always more work to be done, especially now that we have some choice regarding how we should implement things. I’m just glad to have someone else following along as a sanity check, and I think I agree with most if not all of your points - you’ve highlighted a few items that are burning a hole in my todo list.
Decide a few of the right things and it all falls into place soon enough, but it is easier to decide when others share their opinion. Thanks
A Classy Update
Decisions are like dominos, knock a few over and the rest come tumbling down.
After some feedback, and a lot of playtesting, it has become clear that the algorithm ADT trees are terribly unwieldy, and not at all the sort of interface that I’d envisioned when setting out on this project. In response, I’ve come to a decision:
ADTs were better than raw strings or constant patterns, but now they are getting in the way - expressions like `AEAD $ GCM (BlockCipher128 AES_256) 16` and `Cryptohash $ SHA3 $ SHA3_512` are awfully frustrating to read and use. I’m (eventually) axing the algorithm ADTs, in favor of a better approach.
I was initially following `z-botan`’s lead, which was helpful at first - however, we are not beholden to that format. Additionally, with the need to add support for `BOTAN_HAS_` conditional defines for individual algorithms, the ADT approach makes less and less sense.
Instead, I am proposing a classier interface that uses data families to ensure type isolation and inference. Originally, I was planning on working on this interface as a separate `cryptography` library (originally called `crypto-schemes`, but that sounds too nefarious), and then making `botan` conform to it in a separate `cryptography-botan` library. However, at this point it seems more sensible to skip the extra step of a separate library, and just implement the conformances in `botan` itself, while developing `cryptography` inside of `botan` to be extracted as a separate library later.
As a result, this update is focused heavily on these new typeclasses:
- Botan.BlockCipher.Class
- Botan.Cipher.Class
- Botan.Hash.Class
- Botan.MAC.Class
- Botan.OneTimeAuth.Class
The new classes are something like:
data family SecretKey alg
data family Ciphertext alg

class BlockCipher bc where
    blockCipherEncrypt :: SecretKey bc -> ByteString -> Maybe (Ciphertext bc)
    blockCipherDecrypt :: SecretKey bc -> Ciphertext bc -> Maybe ByteString

data family Nonce alg

class Cipher c where
    cipherEncrypt :: SecretKey c -> Nonce c -> ByteString -> Ciphertext c
    cipherDecrypt :: SecretKey c -> Nonce c -> Ciphertext c -> Maybe ByteString

data family Digest alg

class Hash h where
    hash :: ByteString -> Digest h

data family Auth alg

class MAC m where
    auth :: SecretKey m -> ByteString -> Auth m

data family OneTimeAuth alg

class OTA ota where
    oneTimeAuth :: SecretKey ota -> Nonce ota -> ByteString -> OneTimeAuth ota
This isn’t exactly how they are (still being) implemented, but it’s an accurate enough representation. Other algorithms and modules having multiple data families are slightly more complicated to write, but are coming soon, pending some more data family work. I have tried to create proof-of-class implementations of at least one algorithm per class type, to show that it functions as intended:
- Botan.BlockCipher.AES
- Botan.Cipher.ChaCha20Poly1305
- Botan.Hash.SHA3
- Botan.MAC.CMAC
- Botan.OneTimeAuth.Poly1305
A gold-star example of a relatively finished algorithm module (and the effectiveness of the approach) would be `Botan.Hash.SHA3`, which we can explore:
import Botan.Hash.SHA3
It has per-algorithm-level functions:
sha3_512 "Fee fi fo fum!"
-- 03a240a2...
It also has algorithm-family-level functions that can use `TypeApplications` to select specific variants:
sha3 @512 "Fee fi fo fum!"
-- This produces the same digest as before
Explicit typing also works:
sha3 "Fee fi fo fum!" :: SHA3Digest 512 -- Or SHA3_512Digest
These functions are implemented via a more generic, classy `Hash` interface, which uses the `Digest` data family to ensure that different algorithms and variants have different types while still being inferred properly.
import Botan.Hash.Class
:i Hash
-- class Hash h where
-- hash :: ByteString -> Digest h
:i Digest
-- data family Digest h
We can allow our hash algorithm to be parametric using `hash`, while still using type applications or inference to select our specific algorithm:
-- Once more at the class-level
hash @(SHA3 512) "Fee fi fo fum!"
-- Once more with explicit typing
hash "Fee fi fo fum!" :: Digest (SHA3 512)
The other classes work for at least one algorithm, but at the moment it might require a bit of `unsafeCoerce` to turn bytestrings into keys, while I get better support for that sort of thing underway.
Here’s `CMAC AES128`:
import Botan.MAC.Class
import Botan.MAC.CMAC
import Botan.BlockCipher.AES
import Botan.RNG
import Unsafe.Coerce
k <- getRandomBytes 16
mac @(CMAC AES128) (unsafeCoerce k) "Fee fi fo fum!"
-- 7989fb40105646e975311785efae3048
And here’s the `ChaCha20Poly1305` cipher:
import Botan.RNG
import Botan.Cipher.Class
import Botan.Cipher.ChaCha20Poly1305
import Unsafe.Coerce
k <- getRandomBytes 32
n <- getRandomBytes 12
ct = cipherEncrypt @ChaCha20Poly1305 (unsafeCoerce k) (unsafeCoerce n) "Fee fi fo fum!"
-- 2b0c0e4e332b4214d3c939b0d1af90a89167d914df538f6cdc364371dd8d
pt = cipherDecrypt @ChaCha20Poly1305 (unsafeCoerce k) (unsafeCoerce n) ct
-- Just "Fee fi fo fum!"
Other classes and data families will be quite similar. Notably, we avoid passing around an explicit algorithm witness / proxy, but remain type-injective due to the data families, and only one call site is required for inference to work. It is also clear that this approach will be very amenable to `TemplateHaskell` in the future. And don’t forget, eventually, these classes will be pulled out into a backend-agnostic `cryptography` library.
I’m still currently working on some support classes for data families in `Botan.Types.Class`, such as `Encodable` and `SecretKeyGen` and `NonceGen`, which have not yet been applied to the aforementioned cryptography classes but will provide the necessary support to make writing data family instances much easier. If you’ve used `saltine` or `cryptonite`, you’ll recognize their influence.
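To give a rough sense of the direction, these support classes are shaped something like the following sketch - the method names are provisional and may not match what ends up in `Botan.Types.Class`:

import Data.ByteString (ByteString)

-- Encodable covers converting component types to and from bytes
class Encodable a where
    encode :: a -> ByteString
    decode :: ByteString -> Maybe a

-- SecretKeyGen / NonceGen cover generating fresh values of the data families
class SecretKeyGen alg where
    newSecretKey :: (MonadRandomIO m) => m (SecretKey alg)

class NonceGen alg where
    newNonce :: (MonadRandomIO m) => m (Nonce alg)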
I would like some feedback from the community on this - it does delay publishing to hackage as well as writing tutorials, as things still shift around a bit.
As always, this has been pushed to the repo.
I’m wondering if it’d be an OK API if you’d have the Enums in their horribly wordy state, but then:
- make newtypes for every section that only accepts parts of certain algos/etc
- don’t export any way to create that newtype
- EXCEPT for a group of pattern synonyms that contain all the valid ones; and
- have a `{-# COMPLETE #-}` pragma to tell GHC the provided patterns are all that are valid
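Something like this rough sketch - the type and pattern names are just for illustration:

{-# LANGUAGE PatternSynonyms, OverloadedStrings #-}
import Data.ByteString (ByteString)

-- Constructor deliberately not exported, so the patterns are the only way in
newtype HashSpec = HashSpec ByteString
    deriving (Eq, Show)

pattern SHA3_512 :: HashSpec
pattern SHA3_512 = HashSpec "SHA-3(512)"

pattern Blake2b_512 :: HashSpec
pattern Blake2b_512 = HashSpec "BLAKE2b(512)"

-- Tells GHC that matching on these patterns is exhaustive
{-# COMPLETE SHA3_512, Blake2b_512 #-}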
Perhaps ‘axing’ is a tad strong - let’s say, relegated to the status of ‘prefer the other interface’ alongside the mutable interface, in favor of this (hopefully) better new approach.
> a group of pattern synonyms that contain all the valid ones
One of the difficulties of creating pattern synonyms for valid algorithm combinations is that there are literally thousands of them - I’d have to rely on `TemplateHaskell` in order to make that feasible. I first ran into the issue with unit testing, when the sheer number of tests (one per combination) caused the unit tests to terminate without outputting the results - this is why there are so many individual unit test targets now.
Note that although I can’t provide convenient aliases for every algorithm combination, I certainly can make sure to do that for all priority / best-in-class algorithms, as I did in `Botan.Hash.SHA3`, which exports not just the family `sha3` but also all of the individual SHA-3 variants (`sha3_224`, `sha3_256`, `sha3_384`, `sha3_512`).
> make newtypes for every section that only accepts parts of certain algos/etc
Things have already been improved beyond yesterday’s update. With the new data families, we are accomplishing exactly this. For example, `AES128SecretKey` (formerly `AES128CipherKey`) is now actually a `newtype` wrapper around a `GSecretKey` type, from which I can automatically derive everything.
First, `GSecretKey` is itself a `newtype` wrapper around `ByteString`:
newtype GSecretKey = MkGSecretKey { unGSecretKey :: ByteString }
    deriving newtype (Eq, Ord, Encodable)
Then I wrap the algorithm-specific newtype around that, plus a pattern and a function to hide the `GSecretKey`:
newtype instance SecretKey AES128 = MkAES128SecretKey GSecretKey
    deriving newtype (Eq, Ord, Show, Encodable)

pattern AES128SecretKey :: ByteString -> SecretKey AES128
pattern AES128SecretKey bytes = MkAES128SecretKey (MkGSecretKey bytes)

getAES128SecretKey :: SecretKey AES128 -> ByteString
getAES128SecretKey (AES128SecretKey bs) = bs
And now I can just say:
newSecretKey @AES128
-- 591b3de67f882893a11af874fbf40bdd
Of course, this won’t be just for `SecretKey` and `BlockCipher` - I’m doing the same with other algorithms and supporting data types such as nonces, ciphertexts, MAC codes, and salts, wherever appropriate.
It’ll all be out in the next update - I’ve already made some mad big strides since yesterday’s update (those dominoes keep tumbling down), and I’ve been in a happy little coding groove for the last few days.
Update: Families!
After a serious shakedown with the recent classy update, things have really settled out.
- `Botan.Types.Class` now provides reusable data families `SecretKey`, `Nonce`, `Digest`, `Ciphertext`, `LazyCiphertext` for common cryptography components
  - Also provides `Has[Foo]` / `Is[Foo]` typeclasses and `G[Foo]` generic types for data families
- `KeySpec` has been generalized to `SizeSpecifier` in `Botan.Types.Class`
- `Botan.Types.Class` is now considered the gold standard (nominally ready for release / documentation / tutorials)
- Modules must conform to `Botan.Types.Class` to be gold standard
These modules have been updated to the current gold standard:
- Botan.BlockCipher.Class
- Botan.BlockCipher.AES
- Botan.Cipher.Class
- Botan.Cipher.ChaCha20Poly1305
These typeclasses are still being worked on to integrate with `Botan.Types.Class`, but the work at this point is mostly mechanical.
- Botan.Hash.Class
  - Needs to be updated to use the `Digest` / `HasDigest` / `GDigest` in `Botan.Types.Class`
- Botan.MAC.Class
  - Ditto `SecretKey` / `HasSecretKey` / `GSecretKey`
  - Ditto `Digest` / `HasDigest` / `GDigest`
- Botan.OneTimeAuth.Class
  - Ditto `SecretKey` / `HasSecretKey` / `GSecretKey`
  - Ditto `Nonce` / `HasNonce` / `GNonce`
  - Ditto `Digest` / `HasDigest` / `GDigest`
These typeclasses still need to be developed and are considered critical for first release:
- Botan.PubKey.Class
- Botan.PubKey.Encrypt.Class
- Botan.PubKey.Sign.Class
Other typeclasses still need to be developed, but are not considered critical for first release as their concrete interfaces currently suffice.
- Password hashing (for Bcrypt, etc)
- FPE
- HOTP / TOTP
- KDF
- Key Wrapping
- KeyAgreement / KeyExchange / PAKE (for SRP6, DHKE)
- ZFEC
We’re delaying our hackage release candidate by a week, maybe two, but it’s worth it, I promise. Slow is smooth, and smooth is fast. I want this library to be smooth.
As always, this has been pushed to the repo.
It really feels like the kind of abstraction problem you’re trying to solve - an abstract interface over crypto algorithms with different backends such as botan - is a perfect fit for backpack…
Someone’s already done that: raaz: Fast and type safe cryptography.
I’m actually surprised this didn’t make the prior art of the tech proposal.
It certainly helps to have every gear polished and lubricated, ready to fit into place…
Oooh, I don’t know how I missed `raaz`, but I’mma have to give it the ol’ look-see-and-digest for a bit - I’m sure there’s things to learn from it.
Small aside, providing different backends for implementing the same algorithm, and providing typeclasses for implementing similar but separate algorithms, are two distinct goals. They do work together quite nicely, however, as they are related.
Update: More gold-standard modules, imminent Hackage release
Hello, all! It is time for another update, but first:
I must apologize, for I have fallen prey to my tendency to keep my eyes on the horizon instead of on the finish line, and have ended up moving my own goalposts. Truth be told, `botan-bindings` and `botan-low` passed the finish line some time ago, and I really should have pulled the trigger on publishing them already.
This is where I really appreciate the community framework; weekly meetings with José have been helping me to keep my updates more regular, and also keep me on track and apprised of any concerns. Frankly, I don’t want to be a silo’d developer, and communication is an essential part of that.
As such, a minor course correction: I’m switching focus, and the goal of the next update is going to be the publishing of `botan-bindings` and `botan-low` to hackage as a candidate package (alongside the third monthly status report, of course, which is also now due! Egads!).
I am a bit anxious, never having published a package to hackage before. I look at the project and it’s very easy for me to forget what I’ve done, and only see what still needs to be done, or could be done - perfectionist’s disease. It’s fine, it doesn’t have to be perfect, it just has to be sufficient for initial release, and it is. We can still keep working on it after we publish.
There’s a few things that must happen before publishing:
First, according to hackage requirements, I need someone to endorse me, so if a fine citizen or two could assist me in this, I would appreciate it.
Secondly, there’s the minor question of `pkgconfig-depends` vs `extra-libraries`. Right now, the stanzas in `botan-bindings.cabal` look like this:
library
    ...
    includes:
        botan/ffi.h
    if os(windows)
        extra-libraries: botan-3
    else
        pkgconfig-depends: botan-3 >= 3.0.0
    ...
This has been working so far, but I want to ensure it is configured properly, since it is how we link to the Botan C++ library. There are more than a few different ways of handling this, spread over several discussions:
- What's the benefit of pkgconfig-depends?
- Allowing pkg-config to fail · Issue #6771 · haskell/cabal · GitHub
- Backtrack when no pkgconfigdb is present by gbaz · Pull Request #7621 · haskell/cabal · GitHub
I’d like to handle it properly - if I’ve read the discussions correctly, we can crank the cabal version dependency up and improve it. I’d appreciate any suggestions here.
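One shape this might take (an untested sketch on my part) is an automatic flag that the solver is allowed to turn off when the pkg-config package can’t be found, falling back to plain linker flags:

flag pkg-config
    description: Use pkg-config to locate the Botan C++ library
    default: True
    manual: False

library
    ...
    if flag(pkg-config)
        pkgconfig-depends: botan-3 >= 3.0.0
    else
        extra-libraries: botan-3
    ...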
Thirdly, I need to go over `botan-bindings` and `botan-low` with a fine-toothed comb and do a few things like establish dependency version constraints to make the package acceptable to hackage. I’ll be on this for the next few days, and then hopefully upload it as a candidate on Monday.
So, time to get it done!
Now, back to this update.
- The following modules have been completed to gold-standard, and conform to `Botan.Types.Class`:
  - Botan.BlockCipher
    - AES
    - ARIA
    - Blowfish
    - Camellia
    - CAST
    - DES
    - GOST
    - IDEA
    - Noekeon
    - SEED
    - Serpent
    - SHALCAL
    - SM4
    - Threefish
    - Twofish
- I’ve added a temporary BlockCipher128 typeclass while a proper blocksize constraint is developed (a rough sketch follows this list).
- There have been improvements to error reporting, and the last exception message is now attached to any thrown Haskell exceptions.
- I’ve laid out a Botan.Easy module that exposes a `saltine`-like interface of recommended algorithms
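A rough sketch of the temporary class - it is essentially just a marker constraint on top of the existing BlockCipher class, and the instance list here is illustrative rather than exhaustive:

-- Marker class for ciphers with a 128-bit block, until a proper block-size
-- constraint (eg, a type-level block size) exists.
class BlockCipher bc => BlockCipher128 bc

instance BlockCipher128 AES128
instance BlockCipher128 AES192
instance BlockCipher128 AES256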
That’s all for now, it’s been pushed to the repo. We’ll be back on this stuff after we publish to hackage.
pkg-config
I strongly recommend `pkgconfig-depends`, if botan upstream publishes `.pc` files. The main reason is that it means botan developers are deciding what the link flags are, rather than all their library consumers making a best guess. It also means you’re declaring a dependency on a native package instead of linker flags (which are more of an implementation detail), which seems more like the correct level of abstraction.
I would hope that you’d be able to use pkg-config on Windows but I haven’t played in that space for many years. Doesn’t GHC come with some sort of MSYS setup for this sort of thing?
Yes, pkg-config works on windows (or at least it is not any worse than other options). See my experience report: Installing a library with C dependencies on Windows