Botan bindings devlog

PubKey progress, Post-quantum support, and Better versioning

Today’s update focuses on two related things - public key infrastructure, and post-quantum algorithms! The former is basically the same thing we’ve been doing with Hash and MAC and Cipher - upgrading it to the new BotanObject, bringing in algorithm data types, adding more convenient interfaces! I won’t bother pasting everything, just the highlights, because there’s a lot.

PubKey improvements

First off, the new data type for public key (PK) algorithms:

data PKType
    = RSAType
    | SM2Type
    | ElGamalType
    | DSAType
    | ECDSAType
    | ECKCDSAType
    | ECGDSAType
    | GOST_34_10Type
    | Ed25519Type
    | Ed448Type
    | XMSSType
    | DHType
    | ECDHType
    | X25519Type
    | X448Type
    | DilithiumType
    | ML_DSAType
    | KyberType
    | ML_KEMType
    | McElieceType
    | ClassicMcElieceType
    | FrodoKEMType
    | HSS_LMSType
    | SphincsPlusType
    | SLH_DSAType
    deriving stock (Show, Eq, Ord, Enum, Bounded)

You may note that the constructors are suffixed with -Type, unlike our earlier primitives. This is because PKI (public key infrastructure) isn’t handled the same way as primitive hashes, ciphers, etc. Those primitives usually combine the algorithm and any parameters into a single argument, but PK keeps the algorithm and the parameters as separate arguments - thus, we have a separate PKScheme type:

data PKScheme
    = RSA Word32
    | SM2 ECGroup
    | ElGamal DLGroup
    | DSA DLGroup
    | ECDSA ECGroup
    | ECKCDSA ECGroup
    | ECGDSA ECGroup
    | GOST_34_10 ECGroup
    | Ed25519
    | Ed448
    | XMSS XMSSParams
    | DH DLGroup
    | ECDH ECGroup
    | X25519
    | X448
    | Dilithium DilithiumMode
    | ML_DSA DilithiumMode
    | Kyber KyberMode
    | ML_KEM KyberMode
    | McEliece Word32 Word32 -- n, t -- TODO: Make a McEliece parameter type
    | ClassicMcEliece ClassicMcElieceParams
    | FrodoKEM FrodoKEMMode
    | HSS_LMS ByteString -- TODO: Make a HSS-LMS parameter type
    | SphincsPlus SphincsPlusMode
    | SLH_DSA SLH_DSAMode
    deriving stock (Show, Eq)
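
To make the split between the two types concrete, here is a minimal sketch of going from a scheme back to its algorithm tag. It is not the bindings' actual code: the types are cut down to three constructors, ECGroup is a two-curve stand-in, and the helper name pkSchemeType is hypothetical.

```haskell
import Data.Word (Word32)

-- Cut-down stand-ins for the real types; only three algorithms are shown.
data PKType = RSAType | ECDSAType | Ed25519Type
    deriving (Show, Eq, Ord, Enum, Bounded)

-- Hypothetical stand-in for the real ECGroup type.
data ECGroup = Secp256r1 | Brainpool256r1
    deriving (Show, Eq)

data PKScheme
    = RSA Word32      -- key size in bits
    | ECDSA ECGroup   -- named curve
    | Ed25519         -- no parameters
    deriving (Show, Eq)

-- Forget the parameters and keep only the algorithm tag.
-- (Hypothetical helper name, assumed for illustration.)
pkSchemeType :: PKScheme -> PKType
pkSchemeType (RSA _)   = RSAType
pkSchemeType (ECDSA _) = ECDSAType
pkSchemeType Ed25519   = Ed25519Type
```

For example, pkSchemeType (RSA 3072) gives RSAType, and every parameter choice for ECDSA collapses to the same ECDSAType tag.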

There are also data types for DLGroup, ECGroup, alg-specific params (I’ve omitted the constructors for brevity):

data ECGroup
data DLGroup
data DilithiumMode
data KyberMode
data ClassicMcElieceParams
data FrodoKEMMode
data SphincsPlusMode
data SLH_DSAMode
data XMSSParams

There is of course a function to get the suggested scheme:

pkSuggestedScheme :: PKType -> PKScheme
pkSuggestedScheme RSAType             = RSA 3072
pkSuggestedScheme SM2Type             = SM2 Sm2p256v1
pkSuggestedScheme ElGamalType         = ElGamal MODP_IETF_2048
pkSuggestedScheme DSAType             = DSA DSA_BOTAN_2048
pkSuggestedScheme ECDSAType           = ECDSA Secp256r1
pkSuggestedScheme ECKCDSAType         = ECKCDSA Secp256r1
pkSuggestedScheme ECGDSAType          = ECGDSA Brainpool256r1
pkSuggestedScheme GOST_34_10Type      = GOST_34_10 Gost_256A
pkSuggestedScheme Ed25519Type         = Ed25519
pkSuggestedScheme Ed448Type           = Ed448
pkSuggestedScheme XMSSType            = XMSS XMSS_SHA2_10_512
pkSuggestedScheme DHType              = DH MODP_IETF_2048
pkSuggestedScheme ECDHType            = ECDH Secp256r1
pkSuggestedScheme X25519Type          = X25519
pkSuggestedScheme X448Type            = X448
pkSuggestedScheme DilithiumType       = Dilithium Dilithium6x5
pkSuggestedScheme ML_DSAType          = ML_DSA ML_DSA_6x5
pkSuggestedScheme KyberType           = Kyber Kyber1024_R3
pkSuggestedScheme ML_KEMType          = ML_KEM ML_KEM_768
pkSuggestedScheme McElieceType        = McEliece 2960 57
pkSuggestedScheme ClassicMcElieceType = ClassicMcEliece ClassicMcEliece_6960119f
pkSuggestedScheme FrodoKEMType        = FrodoKEM FrodoKEM976_SHAKE
pkSuggestedScheme HSS_LMSType         = HSS_LMS "SHA-256,HW(10,1)"
pkSuggestedScheme SphincsPlusType     = SphincsPlus SphincsPlus_SHA2_128_Small
pkSuggestedScheme SLH_DSAType         = SLH_DSA SLH_DSA_SHA2_128_Small

Then, generating a key is as simple as:

prk <- generatePrivKey (pkSuggestedScheme SLH_DSAType)

I would like to note that this only covers key generation - unlike the other cryptographic primitives, a given PK algorithm is only associated with key generation, and the various operations of key agreement / key encapsulation / signatures / encryption require their own additional scheme, which will be the subject of the next update.

It’s kind of complicated: not every PK algorithm supports every PK operation (in fact, most support only one or two, and you usually need different algorithms for e.g. key agreement vs signing vs encryption). In the meantime, I have created a table to help disambiguate the various PK algorithms and their uses - I’m still testing, but the preliminary result is this:

ALG             KA      KEM     Sign    Encrypt PQ      Notes
-----------------------------------------------------------------
- Prime factorization
RSA             Yes     Yes     Yes     Yes
- DL Groups
DH              Yes
DSA                             Yes
ELGAMAL                                 Yes
- ECC
ECDH            Yes
EC*DSA                          Yes
ECIES                                   Yes             Not FFI supported
SM2             Yes     Yes     Yes     Yes
GOST-34.10                      Yes                     Deprecated
- Named curves
X25519          Yes
X448            Yes
ED25519                         Yes
ED448                           Yes
- Post-quantum
MCELIECE                Yes                     Yes
FRODOKEM                Yes                     Yes
KYBER                   Yes                     Yes
ML_KEM                  Yes                     Yes
DILITHIUM                       Yes             Yes
ML_DSA                          Yes             Yes
SPHINCS_PLUS                    Yes             Yes
SLH_DSA                         Yes             Yes
HSS_LMS                         Yes             Yes     Stateful
XMSS                            Yes             Yes     Stateful
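
A table like this could eventually live in code as well. The following is only a sketch under assumptions: it copies a handful of rows from the preliminary table above (so it inherits any mistakes in it), uses a cut-down PKType, and the names PKOp, pkTypeOps and pkTypesFor are hypothetical rather than anything the bindings export.

```haskell
-- Cut-down stand-in for the real PKType; a handful of rows from the table.
data PKType
    = RSAType | DHType | DSAType | ElGamalType
    | X25519Type | Ed25519Type | ML_KEMType | ML_DSAType
    deriving (Show, Eq, Ord, Enum, Bounded)

-- The four operation columns of the table (hypothetical type).
data PKOp = KeyAgreement | KeyEncapsulation | Signing | Encryption
    deriving (Show, Eq, Ord, Enum, Bounded)

-- One equation per table row, copied from the preliminary results.
pkTypeOps :: PKType -> [PKOp]
pkTypeOps RSAType     = [KeyAgreement, KeyEncapsulation, Signing, Encryption]
pkTypeOps DHType      = [KeyAgreement]
pkTypeOps DSAType     = [Signing]
pkTypeOps ElGamalType = [Encryption]
pkTypeOps X25519Type  = [KeyAgreement]
pkTypeOps Ed25519Type = [Signing]
pkTypeOps ML_KEMType  = [KeyEncapsulation]
pkTypeOps ML_DSAType  = [Signing]

-- Read the table column-wise: which algorithms support a given operation?
pkTypesFor :: PKOp -> [PKType]
pkTypesFor op = [ t | t <- [minBound .. maxBound], op `elem` pkTypeOps t ]
```

With these rows, pkTypesFor Signing lists RSAType, DSAType, Ed25519Type and ML_DSAType, in declaration order.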

Post Quantum support

The keen-eyed among you will already have noticed the increased support for post-quantum algorithms, specifically the support for the recently approved FIPS / NIST final selection of post-quantum algorithms!

  • Key encapsulation
    • McEliece
    • ClassicMcEliece
    • FrodoKEM
    • Kyber
    • ML_KEM (NIST approved Kyber)
  • Digital signatures
    • Dilithium
    • ML_DSA (NIST approved Dilithium)
    • SphincsPlus
    • SLH_DSA (NIST approved Sphincs)
    • XMSS (stateful, use with caution)
    • HSS_LMS (stateful, use with caution)

The definitions for their parameter types are not super interesting, aside from my having to spend a few days* scouring the C++ source code to find their precise formats and arguments. Stuff like inconsistent casing really screws with accurately identifying algorithms, but that’s all taken care of now.

* I am glossing over a lot here. It was a huge pain in the ass tracking down every algorithm’s specific, inconsistently capitalized, case-sensitive algorithm and parameter identifier, because they only exist as magic strings in the C++ source code! For example, the algorithm is SPHINCS+ with SHA512 but the params actually need to be formatted as SphincsPlus and sha2, and this is just plain not mentioned anywhere! It’s fricken terrible! And you don’t have to deal with that anymore!

You still need an appropriate version of botan to enable support, however, and that takes us to our final section of the update!

Versioning support

These bindings were originally written against botan 3.2, and in the time since, botan has received several updates - 3.11 is now available, and enough things have been added that I am currently determining how I am going to handle versioning support - which is a problem.

One of the issues with adding support for new algorithms is that I still need to handle botan versions that don’t have them yet - and while the lowest botan-bindings take bytestring names and parameters for algorithm identifiers, now that we have data types for algorithms, I need some way of indicating to the user whether a given algorithm is actually available.

Why is it a problem? Well, botan provides botan/build.h as a file to include to gain access to the BOTAN_HAS_<alg> defines, which sounds perfect! Just include it, and use CPP and conditional compilation, right? Something like:

{-# LANGUAGE CPP #-}
module Botan.Low.PubKey where

import Botan.Bindings.PubKey

-- Like this
#include <botan/build.h>

-- So I could do things like:
pkTypeIsSupported :: PKType -> Bool
#if defined(BOTAN_HAS_RSA)
pkTypeIsSupported RSAType = True
#endif
pkTypeIsSupported _ = False

Did that work? Nope!

It fails because build.h has an indented define that causes the strict GHC C preprocessor to fail with an error - this little line here:

#ifndef BOTAN_DLL
  #define BOTAN_DLL __attribute__((visibility("default")))
#endif

If anyone knows how to allow GHC to parse this without failing, it would be really nice to be able to just import the file. Otherwise, to fix this we need to either pre-parse that file as a pre-build step to fix it, or otherwise generate the list of BOTAN_HAS_<alg> supported algorithms for us to consume, which seems like a lot of work for what is ultimately a fragile bandaid - really this should be fixed by Botan C++ itself, so I will probably create an issue for them.
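
For the record, the "generate the list" option might look roughly like the sketch below. It is not something I have wired into the build: the function name botanHasDefines is hypothetical, the sample header text and its date values are made up for illustration, and a real pre-build step would pass in the contents of the installed build.h instead.

```haskell
import Data.List (isPrefixOf)

-- Collect the names of all BOTAN_HAS_* macros defined in the text of a header.
-- `words` skips leading whitespace, so indented #define lines are found too.
-- (Hypothetical helper name, assumed for illustration.)
botanHasDefines :: String -> [String]
botanHasDefines header =
    [ name
    | line <- lines header
    , ("#define" : name : _) <- [words line]
    , "BOTAN_HAS_" `isPrefixOf` name
    ]

-- Made-up header fragment; the date values are placeholders.
sampleHeader :: String
sampleHeader = unlines
    [ "#define BOTAN_HAS_RSA 20160730"
    , "#ifndef BOTAN_DLL"
    , "  #define BOTAN_DLL __attribute__((visibility(\"default\")))"
    , "#endif"
    , "  #define BOTAN_HAS_X25519 20180910"
    ]
```

Running botanHasDefines over sampleHeader keeps the two BOTAN_HAS_ names and skips the indented BOTAN_DLL define. The resulting list could then be written out as a Haskell module or a set of -D flags, but that still leaves the fragility problem.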

Since I can’t include the file directly, for now I have instead resorted to using CApiFFI to import the defines as constants one by one.

-- Since the indented #define doesn't allow us to use the constants directly for
-- conditional compilation, we will import the defines and allow the compiler to
-- elide things via constant folding

-- Prime factorization
foreign import capi safe "botan/build.h value BOTAN_HAS_RSA" botan_has_rsa :: CInt

-- DL
foreign import capi safe "botan/build.h value BOTAN_HAS_DL_GROUP" botan_has_dl_group :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_DIFFIE_HELLMAN" botan_has_diffie_hellman :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_DSA" botan_has_dsa :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_ELGAMAL" botan_has_elgamal :: CInt

-- ECC
foreign import capi safe "botan/build.h value BOTAN_HAS_ECC_GROUP" botan_has_ecc_group :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_ECC_PUBLIC_KEY_CRYPTO" botan_has_ecc_public_key_crypto :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_ECDH" botan_has_ecdh :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_ECDSA" botan_has_ecdsa :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_ECKCDSA" botan_has_eckcdsa :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_ECGDSA" botan_has_ecgdsa :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_SM2" botan_has_sm2 :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_GOST_34_10_2001" botan_has_gost_34_10_2001 :: CInt

-- Named curves
foreign import capi safe "botan/build.h value BOTAN_HAS_X25519" botan_has_x25519 :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_X448" botan_has_x448 :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_ED25519" botan_has_ed25519 :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_ED448" botan_has_ed448 :: CInt

-- Post-quantum
foreign import capi safe "botan/build.h value BOTAN_HAS_MCELIECE" botan_has_mceliece :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_CLASSICMCELIECE" botan_has_classicmceliece :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_FRODOKEM" botan_has_frodoKEM :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_KYBER" botan_has_kyber :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_KYBER_90S" botan_has_kyber_90s :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_ML_KEM" botan_has_ml_kem :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_DILITHIUM" botan_has_dilithium :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_ML_DSA" botan_has_ml_dsa :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_HSS_LMS" botan_has_hss_lms :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_SPHINCS_PLUS_WITH_SHA2" botan_has_sphincs_plus_with_sha2 :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_SPHINCS_PLUS_WITH_SHAKE" botan_has_sphincs_plus_with_shake :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_SLH_DSA_WITH_SHA2" botan_has_slh_dsa_with_sha2 :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_SLH_DSA_WITH_SHAKE" botan_has_slh_dsa_with_shake :: CInt
foreign import capi safe "botan/build.h value BOTAN_HAS_XMSS_RFC8391" botan_has_xmss_rfc8391 :: CInt


-- TODO: Also use has_dl_group, has_ecc_group
pkTypeIsSupported :: PKType -> Bool
pkTypeIsSupported X25519Type = botan_has_x25519 > 0
pkTypeIsSupported X448Type = botan_has_x448 > 0
pkTypeIsSupported RSAType = botan_has_rsa > 0
pkTypeIsSupported McElieceType = botan_has_mceliece > 0
pkTypeIsSupported ClassicMcElieceType = botan_has_classicmceliece > 0
pkTypeIsSupported FrodoKEMType = botan_has_frodoKEM > 0
pkTypeIsSupported KyberType = botan_has_kyber > 0 || botan_has_kyber_90s > 0
pkTypeIsSupported ML_KEMType = botan_has_ml_kem > 0
pkTypeIsSupported DilithiumType = botan_has_dilithium > 0
pkTypeIsSupported ML_DSAType = botan_has_ml_dsa > 0
pkTypeIsSupported HSS_LMSType = botan_has_hss_lms > 0
pkTypeIsSupported SphincsPlusType = botan_has_sphincs_plus_with_sha2 > 0 || botan_has_sphincs_plus_with_shake > 0
pkTypeIsSupported SLH_DSAType = botan_has_slh_dsa_with_sha2 > 0 || botan_has_slh_dsa_with_shake > 0
pkTypeIsSupported XMSSType = botan_has_xmss_rfc8391 > 0
pkTypeIsSupported Ed25519Type = botan_has_ed25519 > 0
pkTypeIsSupported Ed448Type = botan_has_ed448 > 0
pkTypeIsSupported ECDSAType = botan_has_ecc_public_key_crypto > 0 && botan_has_ecdsa > 0
pkTypeIsSupported ECKCDSAType = botan_has_ecc_public_key_crypto > 0 && botan_has_eckcdsa > 0
pkTypeIsSupported ECGDSAType = botan_has_ecc_public_key_crypto > 0 && botan_has_ecgdsa > 0
pkTypeIsSupported SM2Type = botan_has_ecc_public_key_crypto > 0 && botan_has_sm2 > 0
pkTypeIsSupported GOST_34_10Type = botan_has_ecc_public_key_crypto > 0 && botan_has_gost_34_10_2001 > 0
pkTypeIsSupported DHType = botan_has_diffie_hellman > 0
pkTypeIsSupported ECDHType = botan_has_ecdh > 0
pkTypeIsSupported DSAType = botan_has_dsa > 0
pkTypeIsSupported ElGamalType = botan_has_elgamal > 0

This isn’t perfect - instead of conditional compilation, we have to rely on constant folding to eliminate dead branches, which isn’t ideal, but at least now we can check whether algorithms are properly supported without having to attempt generating a key or context and then catching a NOT_IMPLEMENTED exception.

Anyway I have spent a few days testing against various versions of botan, and it is extremely satisfying to be able to print out and verify what algorithms are supported by this installation, and see it change when I change installation versions. I intend to do the same thing for the other modules in the future, adding versioning support.
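
Printing that list is mostly a matter of leaning on the Enum and Bounded instances. Here is a minimal sketch, assuming a cut-down PKType and a stand-in pkTypeIsSupported with hard-coded answers (the real one reads the imported BOTAN_HAS_ constants); the names allPKTypes, supportedPKTypes and supportReport are hypothetical.

```haskell
-- Cut-down stand-in for the real PKType.
data PKType = RSAType | X25519Type | ML_KEMType | SLH_DSAType
    deriving (Show, Eq, Ord, Enum, Bounded)

-- Stand-in for the real pkTypeIsSupported; here we pretend to be on an
-- older botan installation that lacks the NIST-named algorithms.
pkTypeIsSupported :: PKType -> Bool
pkTypeIsSupported RSAType     = True
pkTypeIsSupported X25519Type  = True
pkTypeIsSupported ML_KEMType  = False
pkTypeIsSupported SLH_DSAType = False

-- Every constructor of a Bounded Enum, in declaration order.
allPKTypes :: [PKType]
allPKTypes = [minBound .. maxBound]

-- Only the algorithms this (pretend) installation provides.
supportedPKTypes :: [PKType]
supportedPKTypes = filter pkTypeIsSupported allPKTypes

-- One human-readable line per algorithm.
supportReport :: [String]
supportReport =
    [ show t ++ ": " ++ (if pkTypeIsSupported t then "yes" else "no")
    | t <- allPKTypes
    ]
```

In GHCi, mapM_ putStrLn supportReport prints one line per algorithm, and swapping botan installations would change the answers without touching this code.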

Another thing is that botan 4 is coming with a current ETA of 2027, which will be the first major version jump we have to handle, so it is better to get on this sooner rather than later - we should be able to absorb any changes due to the ergonomics refactor, since we no longer need to stay 1:1 with the bindings. I don’t expect huge changes to the FFI, though several algorithms are slated to be removed. Good thing I’m adding versioning support now, rather than later.

Additional miscellanea

I am also looking into the view functions that have been added in 3.5 - they are designed to help avoid some of the song-and-dance routine required to give some algorithms the right-sized buffers by instead giving you access to the buffer and making you copy it yourself.

On the main branch, someone opened a PR to fix some base64 decoding for 3.12, which Joris has merged, and new minor versions are on their way!

Health

My health has continued to improve, so much that I have recently broken a personal record / hit a milestone and was able to walk 12 miles (a half-marathon!) in a single day 🥳

One year ago, I was having to consider getting a cane because of my leg, but this weekend, I was able to jump properly on it for the first time in a very long time. I have only been able to regain my health because this project and your support afford me the time I need to focus on it - I can work on my PT while mulling over problems, and I don’t have the stress of a boss clocking my time and forcing me to sit down for 8-10 hours a day.

That means working on this project is a joy that gives me energy, rather than one that drains me - and that keeps these updates coming!


That’s all for now!
