Thanks for the updates!
Today’s update is small, but concerns an important thing. Up until now, I’ve been relying on the garbage collector to clean up references at its leisure, but it is good security to destroy sensitive objects as soon as you are done with them. I have also implemented FPE because it was low-hanging fruit.
Changes:
- Created immediate destructor functions for objects eg
cipherDestroy
- Created functions using
bracket
that perform cleanup immediately (or as immediately as “The resource will be released as soon as possible” means), egwithCipher
. - Implemented Botan.FPE module (Format Preserving Encryption), (untested)
The repo has been updated.
In contrast to the previous minor update, this one is quite sizable I wanted to get at least the direct bindings / foreign calls done, and on that I feel I succeeded. In fact, I got so much done, that it is now easier to talk about what is left to do.
There are a scattering of functions that are still missing foreign calls and implementations, mostly related to X509, ZFEC, as well as the optional view functions, and a bunch of missing algorithm name constants - but otherwise, the the lowest-level raw bindings are about 95% complete, and the low-level non-idiomatic-Haskell layer is about 80% complete, pending those things.
Changes are:
- Implemented loading functions for Diffie-Helmann, DSA, ElGamal, and RSA public and private keys
- Implemented most of the remaining functions in Botan.PubKey, (failed) attempts to get view functions working
- Added documentation to Botan.Hash, Botan.Random, made Botan.Random more consistent, minor notes and cleanup
- Implemented Botan.PubKey.Ed25519 module
- Implemented Botan.PubKey.KeyAgreement module
- Implemented Botan.PubKey.KeyEncapsulation, allocBytesWith function
- Implemented Botan.PubKey.X25519 module
- Minor documentation / organization / missing foreign calls
- Implemented Botan.SRP6 module
- Started implementing Botan.X509 module, implemented foreign calls and some functions
- Started implementing Botan.ZFEC (non-exposed, zfecEncode works but zfecDecode causes segfaults)
- Added stub modules for Botan.PBKDF, Botan.Scrypt which contain only deprecated-because-generalized functions
There’s still a lot to do, but this library is rapidly approaching a level of completion and consistency that will soon be worth versioning and publishing to hackage, even if it is just the lower-level bindings!
Today’s update is fairly simple but sizeable. It consists mostly of two things - finishing up most of the X509 functions, and a lot of reorganizing under the hood. As part of final prep / shakedown for the first proper version release, I’ve split the library into multiple pieces, to keep raw bindings, low-level, and high-level implementations separate. That way, if you want to use the bindings but disagree with some of the higher-level implementations, you are free to use the lower-level libraries and implement your own!
Changes are:
- Finished implementing Botan.X509 (mostly)
- Split into multiple libraries
-
botan-bindings
for raw ffi -
botan-low
for low-level, unidiomatic ByteString + IO interface -
botan
for high-level idiomatic interface with type safety / referential transparency
-
- Made some constant names a lot friendlier in Botan.Low
Remaining -bindings and -low issues are mostly algorithm constants and a few scattered functions, and a thorough testing to ensure that functions are implemented correctly. I’ve also been working on getting zfecDecode
remaining, but it is proving to be finnicky as its output parameters are ill-described and involve either a contiguous 2D array, or an array of arrays, and it isn’t clear what the layout is.
The repo has been updated.
It’s update time again!
Today’s update is a bit of documentation,
-
botan-bindings
has been more or less documented, first pass.- Each function’s documentation should at least contain its C declaration
- Most functions have documentation cloned from the Botan C FFI header file
- The correctness of the documentation is sometimes suspect, as some of it
is taken from or inferred from the C++ documentation.
- The correctness of the documentation is sometimes suspect, as some of it
-
botan-low
has had the module-level documentation added, and a few modules have been documented
In addition, we have the first sneak peek at the botan
high-level bindings!
- Implemented
Botan.Prelude
,Botan.Error
,Botan.Utility
,Botan.Version
high-level idiomatic bindings - First stab at implementing
Botan.Hash
with high-level API design- Implemented ‘pure / referentially transparent’ versions of hash functions
- Implemented
HashSpec
data type - Implemented prototype
Hash
andIncrementalHash
classes with data families - Can use the
HashSpec
data type with the genericHashCtx
andHashDigest
, or can use the data families and get strongly-typedCtx a
andDigest a
. - Implemented MD5 as a hash / data family
The naming is a mess, and any of it may be changed wildly at any moment since I’m still shaking it down, but here’s what I’ve got:
The first nice thing is that there is now a HashSpec
datatype for supported hash algorithms, based on Z-Botan
's HashType
as a reference (and saving me a lot of effort). This is of course far more convenient than remembering algorithm names yourself.
data HashSpec
-- Cryptographic hashes
= Blake2b Blake2bSize
| Keccak1600 Keccak1600Size
| MD4
| MD5
| RIPEMD160
| SHA160
| SHA224
...
-- remainder omitted
hashCtxInitWithSpec :: HashSpec -> HashCtx
hashCtxInitWithSpec = hashCtxInit . hashSpecName
hashWithSpec :: HashSpec -> ByteString -> HashDigest
hashWithSpec spec = hashWithHashCtx (hashCtxInitWithSpec spec)
I’m thinking I might push the HashSpec down to botan-low
, keeping the monotyped functionality together and leaving the polymorphic types for the higher-level botan
, and I’ll be doing the same for other cryptographic primitives in the future after I decide upon a consistent handling, but this is a big leap forward in usability.
The second nice thing is the data families interface / API, which I adapted from a prior experimental project investigating different ways of class-ifying cryptographic primitives. It’s not quite the same as cryptonite
, but I’ve found the data families + newtypes approach to be the most effective, albeit at the cost of some up-front boilerplate for the newtypes. It’s nice, because it enables type-applications to control the algorithm selection.
-- The data families and classes
data family Ctx a
data family Digest a
class Hash a where
hashWithCtx :: Ctx a -> ByteString -> Digest a
class (Hash a) => IncrementalHash a where
hashInit :: Ctx a
hashUpdate :: Ctx a -> ByteString -> Ctx a
hashUpdates :: Ctx a -> [ByteString] -> Ctx a
hashFinalize :: Ctx a -> Digest a
-- MD5 implemented as an example
data MD5
newtype instance Ctx MD5 = MD5Ctx
{ getMD5Ctx :: HashCtx }
newtype instance Digest MD5 = MD5Digest
{ getMD5ByteString :: ByteString }
deriving newtype (Eq, Ord)
type MD5Ctx = Ctx MD5
type MD5Digest = Digest MD5
instance Show (Digest MD5) where
show :: Digest MD5 -> String
show (MD5Digest bytes) = Text.unpack $ hexEncode bytes Lower
instance Hash MD5 where
hashWithCtx :: Ctx MD5 -> ByteString -> Digest MD5
hashWithCtx (MD5Ctx ctx) bytes = MD5Digest $ hashWithHashCtx ctx bytes
instance IncrementalHash MD5 where
hashInit :: MD5Ctx
hashInit = MD5Ctx $ hashCtxInit "MD5"
hashUpdate :: MD5Ctx -> ByteString -> MD5Ctx
hashUpdate (MD5Ctx ctx) bytes = MD5Ctx $ hashCtxUpdate ctx bytes
hashUpdates :: MD5Ctx -> [ByteString] -> MD5Ctx
hashUpdates (MD5Ctx ctx) chunks = MD5Ctx $ hashCtxUpdates ctx chunks
hashFinalize :: MD5Ctx -> MD5Digest
hashFinalize (MD5Ctx ctx) = MD5Digest $ hashCtxFinalize ctx
hash :: (IncrementalHash a) => ByteString -> Digest a
hash = hashFinalize . hashUpdate hashInit
It’s still a little in flux, and the Hash
and IncrementalHash
classes aren’t necessarily set in stone (for instance, the hashInit
could belong to Hash
or even a separate class entirely) but I think you get the gist. Let me know if you think this is unnecessarily complex compared to cryptonite
's simple, non-data-family- Digest a
and such - I want to get some design feedback before investing a ton of effort into doing it for all of the primitives.
Also, note that the HashSpec
enum doesn’t interfere with our higher-level classy bindings, as the enum is simply a listing of botan-supported algorithms. We are free to define our own cryptographic schemes using the data families, and the high-level API might be split off onto its own in the future, in a sort of mirror symmetry to splitting off botan-bindings
in the other direction.
That’s it for now. My next concrete goal is to create -Spec
data types for other cryptographic algorithms & primitives, while I play around with the higher-level interfaces. That plus documentation should render the botan-low
library more-or-less complete (minus ZFEC), though it may undergo nomenclatural overhaul.
I’ve noticed that my update pace has slowed down a little. This isn’t unexpected - after all, I have been giving updates every day or other for the last month, and while I’ve made it way further than I expected in that time, it has also been a pretty hefty pace to sustain. I feel I need to recoup some energy, and so I might take a bit of a break once I achieve the next major milestone; if this thread gets a bit quiet, don’t worry - I’m not going away, just taking a breather
Your pace has been astounding - a rest is well-deserved! Thanks again for the posts!
A thought: your three different libraries might be a good fit for the multiple public libraries feature in cabal, since I suspect they will probably always be released in lockstep. It also gives you a bit more freedom to rearrange things without having to use more of the Hackage package namespace.
The problem is that afaik stack
doesn’t support them, so if we want to eventually start using botan
in packages that currently use crypton
(or make crypton
use botan
), it’s a no-go.
Hey all, I’m back after a nice break, and I’ve got an update! Its a bit messy - one big commit, because there was a bunch of refactoring, and the intermediate stages weren’t viable.
Here’s what’s changed:
-
The library hierarchy has been refined slightly, to better reflect each library’s intent.
There is now:
-
botan-bindings
- Raw FFI bindings -
botan-low
- Low-level ByteString / IO interface -
botan
- High-level idiomatic interface -
crypto-schemes
- High-level backend-agnostic abstract cryptography interface -
crypto-schemes-botan
- Botan backend forcrypto-schemes
- I am also considering future additional libraries:
-
botanite
- Frontend for drop-incryptonite
compatibility (ifcrypto-schemes
proves insufficient) -
botanium
- Alibsodium / saltine
-like frontend with preselected algorithms for ultra-simple use.
-
-
-
The
*.Hash
modules are the best example of a complete vertical stack, frombotan-bindings
all the way tocrypto-schemes-botan
. -
botan-bindings
andbotan-low
are now effectively (though not perfectly) stable:- there are still a few missing minor functions, and ZFEC is probably still broken.
- Some patterns have been removed from
botan-low
, just use theBOTAN_...
constants from thebotan-bindings
layer for now.
-
Some nomenclature has been refined
-
Context objects like
Hash
andCipher
have been renamed to reflect their nature- Eg,
Hash
→HashCtx
, andhashInit
→hashCtxInit
- This has been done to facilitate ergonomic naming in higher-level libraries
- the
-Spec
naming format has been dropped, egHashSpec
is now justHash
- the
- Value objects like
PrivKey
andMP
have not been renamed.
- Eg,
-
botan-low
function names were suffixed with IO, typealiases added for algo names. -
New modules
PubKey.ECDH
PubKey.ECDSA
andPubKey.SM2
split off fromPubKey
-
Algorithm-specific functions moved from
PubKey
to algorithm-specific modules -
Random
was renamed toRNG
for consistency with Botan. -
The
PubKey
module has not yet been, but may be renamedPK
for consistency in the future, or its submodules moved fromPubKey.*
toPK.*
. -
Some lazy IO has been possibly fixed
- Some functions had laziness that I suspect was causing intermittent INSUFFICIENT_BUFFER_SPACE exceptions
- This may have been the cause of earlier ZFEC implementation failures, though the implementation now is probably incorrect due to experimentation.
-
Data types for algorithms are now available in the
botan
library-
botan-low
init-with-name functions have been renamed to...InitName...
-
botan
now provides...Init
functions which take an algorithm data type - I’ve tried to make it ergonomic, and split off families as sub- data types.
- Objects can now be initialized with data types like
cipherModeInitIO (Cipher (CBC (AES AES256) PKCS7))
instead of with strings likecipherModeInitNameIO "AES-256/CBC/PKCS7"
- Objects can now be initialized with data types like
-
-
Support for a bunch of newer (eg, post-quantum) algorithms has been added
Dilithium
Kyber
McEliece
-
SPHINCS+
is not working, but I’ll figure out what params it needs.
-
I am figuring out the ‘padding’ algorithm names (EME and EMSA) though there are still some questions and so naming is awkward. This will be refined
I’ve also been digging more into the Botan source, and have found that the FFI bindings have a few gaps.
-
The stream cipher capability is not exposed in botan FFI. It turns out that Z-Botan has some custom code in
cbits/
that replicates this functionality. -
FFI APIs for X.509 are insufficient at the moment. “For whatever reason we expose CRLs as a type, but they are barely usable. You can parse a CRL, but can’t even properly examine it, verify it, etc.”
See: FFI APIs for X.509 are insufficient · Issue #3627 · randombit/botan · GitHub
-
There is also X509- and TLS- related code in Z-Botan’s
cbits/
. -
Some contexts cannot have their state copied (eg, Cipher*) and so making them referentially transparent is tricky. They do have a ‘clear’ operation, though, so as long as I only expose end-to-end operations and reset them at the end, they can be reused.
* Because things with a nonce… should only be used once…
(Linear types may be of interest here.)
-
Some algorithm parameters are referred to as just ‘padding’. I believe that there is EME for encryption, and EMSA for signing, but there may be some overlap.
-
A bunch of functions (especially PK-related) take an unused flags parameter, and we can just pass in 0.
-
There is no constant for some things, especially default flags.
This is all stuff I’ll be focusing on in the future.
Between the Z-Botan
source code, the FFI docs and source code, and the C++ docs and source code, I am slowly figuring it all out. botan-bindings
and botan-low
are now effectively (though not perfectly) stable, and the botan
library is beginning to take shape.
The repo has been updated.
I have enjoyed working on this set of libraries over the past month, having had the temporary luxury of focusing on it more or less full-time, and I would quite like to continue doing so on a more long-term basis.
There is still quite a lot of work to get everything stable and to write unit tests and tutorials and such, and it will still require the effort of several months before everything is production-grade quality.
However, I do not have the runway for that - at least, not without help.
I’ve worked on this for the last month, putting myself out here as a show of good faith. I would like to bring to the community a proposal for what I would like to achieve over the next three months, and discuss the means of funding it. If I have community support, I can continue to focus on this project indefinitely (or at least until it is done), and I would very much like to.
Please, let me know what you think of this, and in the meantime, I will get to writing the proposal and setting up an official project page.
It’s a tough one; I faced this dilemma with amazonka
as well, where I really did not feel good about dropping another 150+ new service packages onto Hackage. At what point is it fair to say that stack
just needs to get with the program? The initial work on multiple public libraries was done for a 2019 GSoC project, and then major improvements came through in cabal 3.8.1.0, released about a year ago. This isn’t a brand-new feature.
I have been following along this particular tangent, considering the recent proliferation of the various layers of libraries I am not sure yet what I will do, but for now it is all in flux still.
Today’s update is minor, but it pleases me greatly. I have fixed the ZFEC implementation, and we now have working forward error correction - that is, we can split data up into n
shards such that only k of n
shards are required to reconstruct the original data. It is useful in situations where you might lose packets during an unreliable stream of data, or while stored on a distributed network.
The changes are:
- Fixed and completed
Botan.Low.ZFEC
andBotan.ZFEC
- Implemented most of
Botan.MAC
With ZFEC implemented, all of the botan-low
modules are functioning, and with Botan.MAC
, we have established the pattern that we will use throughout the rest of botan
.
These changes have been pushed to the repo
how far is this in comparison to zbotan? I’m curious if I could use this to replace the crypton stuff in mysql-haskell.
It is a little hard to compare them - in some ways it is still behind Z-Botan
but in others it has already surpassed it. It depends on what you need, and what you’re willing to put up with.
My bindings expose a slightly larger set of functionality, but it is currently a bit rough to use as I am still working on the ergonomics and there are a few issues here and there*. On the other hand, Z-Botan
keeps a lot of functions in IO
**, and is a fairly faithful implementation, so there’s not much distance between it and botan-low
+ botan
if you import both.
Right now I’d continue to use what you are using, as a lot of my code isn’t rigorously tested yet and I am still working on adding a purer libsodium
-like interface plus crypton
-like classes and standalone algorithm data types, but once that is all done there shouldn’t be any major functionality missing compared to Z-Botan
or crypton
(pending a few C++ shims), and I’d totally say go for it.
* major ones being lack of X509 Cert store, TLS support which I need to shim / patch with C++ and have been experimenting with such for the last few days It does look like
mysql-haskell
does need the cert store stuff, so you’ll have to wait until I specifically get that in.
** Providing a pure layer over botan poses some challenge because the context objects are not-copy-able, by design, and this interferes with referential transparency unless we basically treat them as linear types. It is something I have spent the last few days furrowing my forehead over.
This also explains why botan only has RNG-as-in-entropy APIs and not deterministic PRNG / DRBG - the one deterministically seedable DRBG is an edge case with a warning and is not exported to the FFI.
To all: I have been experimenting with a few things, and will be publishing another update soon.
Hello all! I’m sorry for the silence, I had a rather unpleasant interaction that ended up wasting a rather lot of my time, and it has taken me a few days more to get back into the swing of things. It is really good to be back on this, instead - and I have quite the update for all of you.
A lot of it is experimental, and I’ve been sort of shaking things to see what’s loose - its hard to design cryptography APIs, and the first draft isn’t always the best candidate, even if it does work. Still, I’ve been doing a bit of a rundown of existing Haskell cryptography APIs, to see what we can use, what we can improve, and what we need to write ourselves.
I’m trying to find a comfortable middle ground between classes that describe how cryptographic systems are constructed (low-level, granular), and how they are used (high-level, abstract), and I have made a fair bit of progress as I continue cataloguing and comparing everything.
Here’s what has changed, broken down by library:
botan-bindings
No changes
botan-low
- Minor renaming for consistency / specificity to avoid function role confusion
- May make less verbose again in the future depending on how well library nomenclature aligns
- C++ integration tests for fixing some of the issues / missing functions in the Botan FFI
- Not pushed, separate branch since is more proof of concept right now still
- It works so far though.
- Should be able to use this to get FFI for the missing X509 and TLS stuff
botan
- Minor renaming for consistency
- Work on
Botan.Bcrypt
, notes on safety - Progress on
Botan.Cipher
- currently non-functional - Experimenting with interfaces in
Botan.Random
-
Botan.Mac
mostly completed, still kind of rough - Rough
Botan.HOTP
module created -
Botan.KDF.kdf
function implemented - Work on testing key pair cryptography
- SPHINCS+ algorithm removed temporarily because it isn’t available in this version
- Added
Botan.Padding
module that may be moved tocrypto-schemes
- Am trying to create a hierarchy, padding is used in a few places but they are separate in botan
- Created a small
Botan.Prelude.Parser
parser class that I think I will need- Small parser for in-memory data, avoids reliance on big third party libraries.
- Will need for parsing bcrypt-like cryptographic object and schema identifiers
- May move to
crypto-schemes
(or somecrypto-utilities
library but I am loathe to split off yet another)
botanium
- Created the highest-level
botanium
library with an interface based onlibsodium / saltine
- Stubbed out exported classes functions and data types
- Implemented
Botanium.Auth
as a test
crypto-schemes
- Began work on developing more abstract cryptography classes using
saltine / botanium
as a starting point - Added a
Crypto.ByteVector
data type for sized bytestrings- Supports a well-defined
Bits
implementation unlikeByteString
- Acts slightly differently than an
Integer
due to fixes size and leading zeroes - May be moved to some sort of
crypto-utilities
if necessary.
- Supports a well-defined
- Added
Crypto.Scheme
class for algorithm / scheme identifiers to allow for bcrypt-like encoding / identification of cryptographic objects (still in progress) - Will be developing classes and data families for things like:
- Keys and Digests
- Hashing
- MAC / Auth code
- Ciphers
- Public Key Encryption / Signing
- Padding
botanite / crypton-botan
- I have started investigating the difficulty of using
botan
orcrypto-schemes-botan
to backcrypton
- It does not have 100% 1:1 coverage of everything in
crypton
, but it appears to cover a fair bit of the exported interface, including the most common use cases. -
crypton/ite
internals are another matter though. - It does seem feasible, though some type signatures may change depending on whether we continue to rely on the
memory
package - The goal could be drop-in
crypton
compatibility, or simply a best-match effort to aid in migrating
Community Proposal
I have started sketching out a community proposal for this project. Right now it is mostly scratch notes of things to mention written in the proposal template, but I can now begin putting together a legible first draft.
We will now continue with our regularly scheduled updates
That’s great work!
Regarding the proposal, I have one comment for one thing in particular:
We need type classes, for cryptography, to allow for user-defined instances.
Be very, very wary of this. Cryptonite’s type classes are famous for needing frequent type annotations because inference gets very hard very quick. This is why I decided for Sel
to use the mighty StrictByteString and leave the conversion to the consumers. It might be seen as bare-bones but eliminates a lot of headaches for everyone.
Thanks! It is really coming together now, isn’t it?
A good warning. I have encountered this irritation myself and it has driven much experimentation to see if I can do better. Right now, I am using data families. It is extra boilerplate compared to type families or phantom types, and there is still the occasional need for a type annotation, but the data families make it a lot easier to infer types, and so far it seems to be an improvement over crypton
's phantom types.
Yeah it’s really nice
Now I’m intrigued.
I’d also vote for the crypto library to just give the resulting byte strings, for “layer-4” libraries to then make them into whatever that library needs or what the use case for it is. E.g. like password
turning them into base64
and adding extra information for the password hashes.
It seems more natural for the crypto library to just do the crypto stuff. Other libraries can take up the role of improving the developer experience by adding APIs on top of the crypto library.
For the sake of usefulness I think it’s nice to have hexadecimal encoding and decoding handled for you by the library, however