Botan bindings devlog

@Axman6 I am down for a video chat to discuss your needs in depth - just send me a private message and we’ll pick a time.

I have a lot to say in response to all this but saying it all would take time that I need to spend getting botan updated / released. So you get fragmented notes in the name of expediency :slight_smile: I hope they are not too insensible.

  • It is brave of you to be testing out the high-level interfaces when they are unstable

    • Super glad you are interested
    • I am trying to keep this in mind, but when I push the release (cross your fingers that its within a day or two) things may change (for the better I hope!)
    • The focus has been on getting everything working in at least one way - then, we can deal with different interfaces
  • Data families vs witness / proxy

    • I don’t mind providing more than one interface, I just happen to like data families
      • Am planning on interfaces that also take a alg value witness or Proxy alg as an alternative, like crypton/ite
    • Most other cryptography operations have multiple types that must work together so data families really works for them
    • Witnesses are simple when you only have a single operation to apply them to
      • Witnesses get hard when you have to apply them to every function in a multi-step operation
      • Or when you need multiple or compound algorithms
    • Your specific use case is the classic weak spot for data family type inference, eg when:
      • Inferring from the return type as the only mention of the actual type as you cited
      • Inferring a type that is not visible to the outer scope (eg, digestBytes $ hash "foo")
    • Hashing only has one related type, which is also produced as an output
      • This means hashing in particular doesn’t work as well with data families, but it does work well with witnesses
      • Being consistent overall is worth the tradeoff though
      • We can still expose a witness / proxy -based interface too so this pain is temporary
  • There is a difference between incremental / online and mutable

    • I’m not surprised you are having issues because I’ve been wanting to improve this are in particular - its confusing and so you should expect a lot of change here in the coming update(s)
    • Incremental is really about online ie chunked processing, I want to reduce it to Strict vs Lazy ByteStrings so a typeclass may then be unnecessary
    • Mutable is really about stateful contexts
      • A crypton/ite -style mutable interface is planned
  • Pure interfaces - only Hash has copyState, so specifically for hashes it is possible to make a stepwise pure interface out of a mutable interface, but for other operations it is not

    • This is actually sensible if you stop to think about it
      • Most cryptographic operations, copying state is highly undesirable or an outright information leak, so they don’t
      • hashing is a rare exception as you may be hashing things sharing the same prefix, so there is a major efficiency gain
      • The same is not necessarily true for other operations eg encryption and signing, especially anything using nonces
    • The botan hash state is mutable, so I am highly averse to pretending that it is pure
      • Need to guarantee that the IO actions are only run once each, and that they are run in the right order
    • Can make a copy of the hash state for every ‘pure’ function, but this is inefficient - but it is possible
    • Can pretend that its pure if we pretend that it is linear (ie, the result is only used once as an argument for the next action)
    • Trying to avoid a class-splosion
    • A similar issue regarding pure vs mutable interfaces can be seen in random with RandomGen vs StatefulGen
  • Regarding memory, I too liked the ByteArrayAccess class and friends, however there are issues that make me hesitant to include it as a dependency

    • The source repository has been archived by the author and is read-only
    • It does not appear to be maintained (and so may warrant initiating a takeover process)
    • botan-low has an absolutely minimal dependency footprint (aside from the botan library itself)
    • However, I have already been considering that it may be worth reviving those classes due to their utility

I hope that is readable and relevant. It will make more sense when I publish the code.

1 Like

[memory] does not appear to be maintained (and so may warrant initiating a takeover process)

It is confirmed that none of Vincent Hanquez’ packages will be maintained, nor has he given an ok for taking it over. He’s ok with anyone forking it and using a new name, so that’s what’s been happening with crypton(ite) and (crypton-)x509 etc.

Xeno’s Paradox Progress

I’ll keep this short; progress toward an update / release is ongoing; I feel as though every update we get about half the remaining work done, but discover more to do. However, an update is good, and I’m trying to not stress about it :slight_smile: It has gotten a bit slow because now I might spend the entire day chasing down one specific operation on one specific algorithm in order to verify that it is well-behaved. The closer you look, the more detail there is to be seen.

Regardless, progress:

  • All the botan modules are now 100% functional-complete
  • Significant reorganization of modules to reduce their number
  • Lots of improvements to PubKey, including post-quantum Kyber / Dilithium algorithm parameters
  • There are proper newtypes for everything, and so type safety is significantly improved
  • Basic unit tests for botan are in progress, and are the major blocker for release now
  • Some massaging / improvements are still needed per catching exceptions and turning them into Either / Maybe when appropriate
    • This bears pondering, as there are some failures where an exception is far more appropriate
    • An example: encrypt is a general function for multiple algorithms; some algorithms have specific handling requirements for plaintext length and chunking, and failure to validate may throw an exception if the function is provided inappropriate arguments - this failure is different from failing to decrypt because of using the wrong key or auth and should return Nothing rather than throwing an exception.
    • This is important because otherwise we have to make a function like encrypt return a Maybe Ciphertext even though encryption really shouldn’t fail the same way decrypt can. This would result in an un-ergonomic interface.
    • I am strongly leaning towards providing validation functions for input, throwing exceptions in the standard functions (and maybe an Either variant that doesn’t), and saving Maybe for functions like decrypt that are expected to fail in specific ways.
  • All of the typeclasses / type family / algorithm-specific modules are still separated out for the moment, will need some minor merging back in, and may not make the initial botan release (as to better focus on the immediate needs)

I’m trying to get this out sooner rather than later, but it will require a minor update to the lower libraries, and I’ll need to make sure I don’t goof that up - I’m thinking I’ll bump botan-bindings and botan-low from 0.0.1 to 0.1.0 because I think there a few minor breaking changes, and release the first version of botan as 0.1.0.

5 Likes

What to do about memory re: crypton/ite?

I’ve been absent for some time due to the need to focus on personal health. It can be a struggle to reach out at times like these, but this last month I’ve not been idle. Herein lie the results of my investigation in the meantime.

In order for botan to provide a palatable alternative to crypton/ite, it is important to understand how people are using it, and what dependencies it implies. One of those dependencies is memory, which supplies a variety of typeclasses and data types regarding low-level pointers and memory. This package, like cryptonite and by the same author, has been archived, and will not be updated in the future.

This is problematic, because even after forking crypton/ite, many libraries still have a transitive dependency on memory. The reason for this is that one of the prominent design choices was to use constraints instead of concrete data types in many of the typeclasses and functions.

For example:

class Cipher cipher where
    ...
    cipherInit :: ByteArray key => key -> CryptoFailable cipher 
    ...
class Cipher cipher => BlockCipher cipher where
    ...
    ecbEncrypt :: ByteArray ba => cipher -> ba -> ba
    ecbDecrypt :: ByteArray ba => cipher -> ba -> ba 
    ...

Thankfully, most algorithm-specific outputs are wrapped in a concrete newtype.

hash :: (ByteArrayAccess ba, HashAlgorithm a) => ba -> Digest a

That does seem to be the general rule, that unstructured blobs of binary data be represented a ByteArray, mostly plaintext / ciphertext / key / salt / password -type data, but it can be slightly onerous when parameters are allowed to be heterogenous

bcrypt
    :: (ByteArray salt, ByteArray password, ByteArray output)	 
    => Int -> salt -> password -> output

As a result, unless one is dealing exclusively with ByteString, and / or the data types supplied by crypton/ite itself, use of memory is almost compulsory. Furthermore, if we wish to provide a similar interface using botan as a backend, this means either accepting memory as a dependency, or doing something about it.

As such, I have spent a lot of time familiarizing myself with its guts and how it is used. I really wanted to know the intent behind many of the decisions the author made (such as how the different allocation / create functions need to be used re: inlining and unsafePerformIO) because it doesn’t do to simply copy and paste. Here’s what I have learned:

The following libraries use both crypton/ite and memory, but there are many more.

  • amazonka-core
  • stack
  • tls
  • password
  • x509

One can look at the reverse dependencies for crypton/ite and memory to see the general cross-section. Thankfully most dependents of crypton/ite found ByteString and the supplied data types sufficient, and so they did not promulgate the issue by depending on memory directly, only indirectly.

So, what is memory?

It is a small, innocuous package, mostly exposing the following classes that allow ByteString and primitive:Data.Primitive.ByteArray-like access to pointers to contiguous memory:

-- Data.ByteArray

class ByteArrayAccess ba where
    length :: ba -> Int
    withByteArray :: ba -> (Ptr p -> IO a) -> IO a
    copyByteArrayToPtr :: ba -> Ptr p -> IO ()

class (Eq ba, Ord ba, Monoid ba, ByteArrayAccess ba) => ByteArray ba where

    allocRet :: Int	-> (Ptr p -> IO a) -> IO (a, ba)

data Bytes

data ScrubbedBytes

-- Data.ByteArray.Sized

class (ByteArrayAccess c, KnownNat n) => ByteArrayN (n :: Nat) c | c -> n where
    allocRet :: forall p a. Proxy n -> (Ptr p -> IO a) -> IO (a, c)

data SizedByteArray (n :: Nat) ba

These classes allow for fast and efficient manipulation of memory, and basically wrap up and classify a lot of low-level functions that GHC uses internally - incredibly powerful, and about as far as you can go without dropping down to using primitive and MagicHash directly.

Notably, the package does not distinguish between pinned and unpinned memory, though the unspoken assumption is that the memory pointed to in withByteArray :: ba -> (Ptr p -> IO a) -> IO a will not be moved for the duration of that operation.

Aside from a few minor quibbles of hierarchy and nomenclature for purposes of consistency with Data.Primitive.ByteArray and Foreign.* and for better separation of bytewise functions vs elementwise functions, it is a useful set of classes. I see no reason to make large changes to it.

So, how is memory being used

I do wish to understand how memory is being used, and so I have gone over many of the direct dependents in order to collate their use cases.

A few particular use cases dominate:

  • ByteArrayAccess and ByteArray classes
  • convert and constEq functions
  • convertToBase functions

Aside from that, there are several errata in the package that we must also consider. Notable:

  • ScrubbedBytes
  • SizedByteArray and ByteArrayN

ByteArrayAccess and ByteArray

By and far, the most-oft used parts of this package are the classes themselves. The major user of these classes is the crypton/ite ecology, and are the core of the package. Enough said - high priority.

convert and constEq

If they aren’t using ByteArray as a general ByteString-like class, directly, chances are they using memory in order to convert between ByteArray-conforming types. This is actually quite sensible, as many cryptography-related packages have some modules / classes / functions for converting various primitives / newtypes to and from bytestrings.

The other common use is to pull in constEq for constant-time equality checks for sensitive information.

Quite often, the memory package is pulled in for just for one or both of these. Between that and the classes themselves, we’ve covered the majority of memory use cases

Base and convertToBase

Seeing occasional use, memory also exposes functionality for converting between various bases in the Data.ByteArray.Encoding module. This is important, but not critical, because there are already packages for converting bytestrings to various bases.

SizedByteArray and ByteArrayN

Seeing infrequent use (eg, NaCl). Essentially a Nat-parameterized ByteArray, it might have been called FiniteByteArray if being consistent with Bits and FiniteBits.

ScrubbedBytes

Seeing infrequent use (eg, NaCl).

memory supplies a ScrubbedBytes class for sensitive memory that protects its Show instance and is automatically zeroed before release. This is a really important but also niche use, and I only observed one or two packages using this functionality.

Here is how the zeroing of memory works for ScrubbedBytes:

getScrubber :: Addr# -> State# RealWorld -> State# RealWorld
getScrubber addr s =
    let IO scrubBytes = memSet (Ptr addr) 0 (I# sz)
     in case scrubBytes s of
            (# s', _ #) -> s'

Essentially, it is using extremely low-level RealWorld magic to dynamically vend a memSet function call in the finalizer of ScrubbedBytes, in an effort to avoid the compiler eliding the zeroing due to the following free (that is, many assemblers will ignore memSet if it is immediately followed by free, because such a memSet is normally considered to be a no-op).

Even if a compiler emits the instructions to zero memory in the compiled binary, that is no guarantee that it will be run, because the CPU may still optimize it out during runtime. As a result, most operating systems provide a specific OS-dependent function for this in order to ensure that the memory is zeroed, and the volatile keyword and memset_explicit function have only recently been properly standardized in C.

I cannot attest to the efficacy of this functionality regarding GHC specifically, but for reference, here is how Botan::secure_scrub_memory works in C++

void secure_scrub_memory(void* ptr, size_t n) {
#if defined(BOTAN_TARGET_OS_HAS_RTLSECUREZEROMEMORY)
    ::RtlSecureZeroMemory(ptr, n);

#elif defined(BOTAN_TARGET_OS_HAS_EXPLICIT_BZERO)
    ::explicit_bzero(ptr, n);

#elif defined(BOTAN_TARGET_OS_HAS_EXPLICIT_MEMSET)
    (void)::explicit_memset(ptr, 0, n);

#elif defined(BOTAN_USE_VOLATILE_MEMSET_FOR_ZERO) && (BOTAN_USE_VOLATILE_MEMSET_FOR_ZERO == 1)
    static void* (*const volatile memset_ptr)(void*, int, size_t) = std::memset;
    (memset_ptr)(ptr, 0, n);
#else

    volatile uint8_t* p = reinterpret_cast<volatile uint8_t*>(ptr);

    for(size_t i = 0; i != n; ++i)
        p[i] = 0;
#endif
}

Replacement of this functionality is considered important, but as a low-level priority compared to the higher-priority functionalities.

Other errara

Outside of that, there are portions of memory that see little to no actual use:

  • Parser and Packer modules
  • Endianness
  • Word128
  • The SipHash and FNVHash modules

I do not think we need to worry about these for the time being.

So, what do we do about it?

I think it would be wise to provide ByteArray-like functionality in botan for better compatibility with crypton/ite and easier migration, and this assessment helps illustrate what capability needs to be preserved, and what can be ignored.

Thus I am considering creating a slimmed down bytearray-classes package, that plucks out the necessary classes and functions in order to allow us to depend on it instead of memory.

We definitely want the bulk of these modules, maybe ignoring the concrete data types and focusing only on the classes:

  • Data.ByteArray
  • Data.ByteArray.Encoding
  • Data.ByteArray.Sized
  • We also probably do want to make some data types a la Bytes and ScrubbedBytes

Using bytearray-classes

Migration would ideally be simple, as I don’t intend on changing the interfaces much, but this actually leads to a rather thorny problem and important decision - where in botan does this library get used? The answer to that is slightly frustating, and one that gives me no small amount of anxiety:

For improving compatibility / ease of migration, the best place to use bytearray-classes and ByteArray constraints like crypton/ite is in botan-low.

The reasoning here is that the concrete ByteString functions can be obtained from the ByteArray functions simply by restricting the types, and we would be able to make all of the internal logic and function generators for allocating ByteStrings much more consistent.

This of course depends on whether people actually prefer a ByteArray constraint over a concrete ByteString parameter at all, as you may find use of TypeApplications suddenly necessary.

5 Likes

Regarding amazonka-core: we only seem to directly use memory for base{16,64} encoding and it might be easy to remove. Issue

Ah, it is not, but I suspect this is a major cause of direct dependencies on memory:

This is not straightforward: the various Digest as that we get from crypton can only be interacted with using the ByteArrayAccess interface from memory, which is why amazonka-core has polymorphic functions for base{16,64} encoding. And if we move to botan we’ll be able to get encoders from there anyway, so there’s little point in doing this refactoring twice.

The password library also doesn’t really need (or at least want) to use memory.

The convert to Bytes we have is only in there because benchmarks showed it is slightly more performant and if we’re depending on memory already (because of crypton(ite)), then why not.
The one we DO want/need is the constEq for security reasons. Though I feel that is something that could also just be added to the bytestring library. Then we’d use that one.

2 Likes

What to do about memory, part 2: Secure memory erasure

Aside from the ByteArray classes, there is another feature that I wish to extract from memory, and that is the ScrubbedBytes, or rather, the functionality that backs it.

What is ScrubbedBytes?

I got into it a bit last post, briefly covering ScrubbedBytes as one of the occasioanlly used features from memory.

To quote the documentation:

ScrubbedBytes is a memory chunk which have the properties of:

    - Being scrubbed after its goes out of scope.
    - A Show instance that doesn't actually show any content
    - A Eq instance that is constant time

Data types like these are used to avoid accidentally leaking sensitive data and / or personally identifiable information through means such as logs and core dump, and any production team worth their salt has a process set up to handle this (usually accompanied by / piped straight into analytics).

Breaking it down

  1. The memory is scrubbed after it goes out of scope.

When all references to a ScrubbedBytes instance are gone, the finalizer runs, and the data will be explicitly zeroed out. This is great! Kind of…

But what does it actually do? Well, if we peek into the source code of newScrubbedBytes, we see that ScrubbedBytes is just an aligned, pinned byte array with a finalizer. The finalizer contains a dynamically-generated scrubber function, which eventually calls memset on the memory region.

ScrubbedBytes.hs:

newScrubbedBytes :: Int -> IO ScrubbedBytes
newScrubbedBytes (I# sz)
    -- Elided zero-size cases
    | otherwise               = IO $ \s ->
        case newAlignedPinnedByteArray# sz 8# s of
            (# s1, mbarr #) ->
                let !scrubber = getScrubber (byteArrayContents# (unsafeCoerce# mbarr))
                    !mba      = ScrubbedBytes mbarr
                in case mkWeak# mbarr () (finalize scrubber mba) s1 of
                    (# s2, _ #) -> (# s2, mba #)
where
    getScrubber :: Addr# -> State# RealWorld -> State# RealWorld
    getScrubber addr s =
        let IO scrubBytes = memSet (Ptr addr) 0 (I# sz)
        in case scrubBytes s of
                (# s', _ #) -> s'
    finalize :: (State# RealWorld -> State# RealWorld) -> ScrubbedBytes -> State# RealWorld -> (# State# RealWorld, () #)
    finalize scrubber mba@(ScrubbedBytes _) = \s1 ->
        case scrubber s1 of
            s2 -> case touch# mba s2 of
                    s3 -> (# s3, () #)

PtrMethods.hs:

memSet :: Ptr Word8 -> Word8 -> Int -> IO ()
memSet start v n = c_memset start v (fromIntegral n) >>= \_ -> return ()

foreign import ccall unsafe "memset"
    c_memset :: Ptr Word8 -> Word8 -> CSize -> IO ()

This attempts to avoid leaking sensitive data in core dumps and memory (I believe it is attempting to use the crossing of FFI / module boundaries in between the memset and free to ensure it is not elided), and is functionality that needs to be preserved, but it’s just calling memset from the C standard library, and it isn’t really exposed in a manner that I am comfortable with, either. In the absence of another mechanism, relying on GHC triggering GC to clean up sensitive memory is just begging for trouble.

Also, this is all done in an attempt to prevent the compiler from eliding the memset because it occurs immediately before a free. Many compilers will automatically remove any writes to a memory region that is about to be freed, which has been a significant thorn in the side of computer security, and has caused countless leaks - there are better options than memset, more on that later.

I would prefer a bracketed withScrubbedBytes -style function over relying on a data type with finalizers, but it is really the secure / guaranteed erasure capability that we are after, and using it in a finalizer is better than nothing.

Notably, this is something that cannot be handled in Haskell - it requires dropping down to C.

  1. A Show instance that doesn’t actually show any content

When we print a value, we don’t want to show sensitive data or PII, so instead we sanitize it. It doesn’t stop you from directly accessing the raw contents and intentionally printing them, but it can save you from accidental exposure, eg if a debug statement makes it into production

This isn’t really specific to any data type or function or even language. Any language with a Show or Printable can override their debug / print / log instances to avoid any sensitive data from being exposed. Thus, preserving this functionality doesn’t require any special effort.

Perhaps instead of requiring the logic in Show, it could be codified in a typeclass SecureShow that provides a printing formatter that automatically elides secure sub-values via Generics or something. However, that requires then that there be a secure logging mechanism that utilizes it, and is a bit far out for the moment.

  1. A Eq instance that is constant time

When we compare sensitive data for equality, the goal is to report whether or not they are equal, and nothing more.

This requires special handling, because if you short-circuit the comparison, how long it takes to compare them may leak information - eg, if we stop comparing at the first byte because they are different, you know something about where in the data the difference occurs, and thus the implementation leaks more information than just ‘is equal or not’.

This is not especially problematic, as we simply need ensure that the entire data is compared before returning the result.

Additionally, constEq is already part of the ByteArray classes, and can be preserved from there.

Replacing memset

Now, with our properties in mind, we can really focus on the big gotcha, which is replacing use of memset. But why? First, some history:

In the beginning, the Universe C standard was created without any concern for data security. This has made a lot of programs very insecure and been widely regarded as a bad move.

Compilers back then were not nearly as smart, and CPU design was much simpler; when you issued a memset call, you could expect that that memory region would be set. Then compilers got smarter, and started optimizing. One of those optimizations is to discard any writes to memory that is about to be freed, and another optimization is to leave the content of malloc uninitialized. You may see where this is going - all of a sudden, data that had been cleared out using memset was reappearing in newly malloced data. This is not a problem if calloc is used, but it is slower, and this issue retroactively affected older code, and we’ve been paying for it ever since.

There have since been developed three general different ways of handling secure memory erasure:

  1. Adding a secure memset variant to the C standard, which is guaranteed not to be elided
  • memset (C89)
  • memset_s (C11)
  • memset_explicit (C23)

memset is the original function we are trying to replace. C11 added the memset_s function for secure erasure.

memset_s() can not be elided: K.3.7.4.1: Unlike memset, any call to the memset_s function shall be evaluated strictly according to the rules of the abstract machine as described in (5.1.2.3). That is, any call to the memset_s function shall assume that the memory indicated by s and n may be accessible in the future and thus must contain the values indicated by c.

Unfortunately, it was both improperly specified, and added as a part of Annex K, which was not a required part of the standard, and thus not widely implemented.

As a replacement for memset_s, C23 specifies the function memset_explicit, which should be preferred in new code. However, C23 is not yet finalized, and thus cannot be assumed to be available yet.

Thus, we should prefer to call memset_explicit if at all possible. I’m not sure whether we should even try to call memset_s, more investigation is needed.

  1. Calling an OS-specific secure memset variant
  • explicit_memset (NetBSD, Oracle)
  • RtlSecureZeroMemory (Windows)
  • explicit_bzero (FreeBSD, OpenBSD, Linux)
  • memzero_explicit (Linux)

Our secondary preference is to call an OS-specific function explicitly intended for securely erasing memory. There are a plethora of these, and we can select which one at compile-time by detecting the operating system.

We should prefer to call these if and only if memset_explicit is not available.

  1. Using volatile function or memory pointers to avoid being optimized out
  • calling memset on the memory region through a volatile function pointer
  • writing to the memory region through a volatile pointer
  • writing an extern value to the memory region

We should fall back on these if and only if no other solution is available. Notably, MacOS / clang uses this fallback in botan, with an accompanying note.

  1. Increasingly wild alternative methods
  • using mmap + MAP_ANONYMOUS + MAP_PRIVATE
  • using a function from a widely-used library (eg OpenSSL, libsodium, botan) which does 1-3 for you

These can’t be relied upon, for various reasons. Also, mmap is POSIX and MAP_ANONYMOUS was just added in POSIX 8.

Doing it properly vs doing what we can

Doing it properly requires deeper support than just C- or ASM-level. Modern CPUs are incredibly complex, with multiple levels of cache to handle, and in some ways, the problem has just moved inside of the CPUs, where the internal microarchitecture may still JIT discard / optimize instructions out of the machine code unless the proper instructions are used. The real solution is for the OS to keep keys / sensitive data / PII inside of a secure enclave, but that’s OS & hardware level, and “Dont ever load it in local memory” is not a solution that is always available.

We can’t handle that here, so the methods described are what we can do at the language level - we do the best to provide a solution for when we DO have sensitive information in local memory.

secure-memory-erase

To that end, I’ve plucked out and stripped down the secure-memory-erase code from botan C++, and have been looking at / comparing it to similar solutions in other libraries like libsodium. Right now it is a single C/C++ header + source file, which exposes a single C function, which is in turn foreign imported by an hsc file.

secure_memory_erase.h:

#pragma once

#include <stddef.h>
#include <stdint.h>

#ifdef __cplusplus
extern "C" {
#endif

void secure_memory_erase(void * mem, size_t bytes);

#ifdef __cplusplus
} // extern C
#endif

secure_memory_erase.cpp:

#include "secure_memory_erase.h"

#include <stdio.h>

#if defined(SME_TARGET_OS_HAS_RTLSECUREZEROMEMORY)

    // Windows: void RtlSecureZeroMemory(void * ptr, size_t n);
    // https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-rtlsecurezeromemory
    #define NOMINMAX 1
    #define _WINSOCKAPI_  // stop windows.h including winsock.h
    #define WIN32_LEAN_AND_MEAN // stop windows.h including lots of things
    #include <windows.h>
    #undef NOMINMAX
    #undef _WINSOCKAPI_
    #undef WIN32_LEAN_AND_MEAN

#elif defined(SME_TARGET_OS_HAS_EXPLICIT_BZERO)

    // Linux, FreeBSD, OpenBSD: void explicit_bzero(void *b, size_t len);
    // https://man.freebsd.org/cgi/man.cgi?query=explicit_bzero
    #include <string.h>

#elif defined(SME_TARGET_OS_HAS_EXPLICIT_MEMSET)

    // NetBSD: void * explicit_memset(void *b, int c, size_t len);
    // https://man.netbsd.org/NetBSD-8.0/explicit_memset.3
    // NOTE: C23 is memset_explicit, *NOT* NetBSD explicit_memset
    #include <string.h>

#endif

namespace Secure::Memory::Erase {

void erase(void * ptr, size_t n) {

    // TODO: If C23, just call memset_explicit like libsodium
    // TODO: If C11 and supported, maybe call memset_s

#if defined(SME_TARGET_OS_HAS_RTLSECUREZEROMEMORY)

    // Windows
    ::RtlSecureZeroMemory(ptr, n);

#elif defined(SME_TARGET_OS_HAS_EXPLICIT_BZERO)

    // FreeBSD, OpenBSD, Linux
    ::explicit_bzero(ptr, n);

#elif defined(SME_TARGET_OS_HAS_EXPLICIT_MEMSET)

    // NetBSD
    (void)::explicit_memset(ptr, 0, n);

#elif defined(SME_USE_VOLATILE_MEMSET_FOR_ZERO) && (SME_USE_VOLATILE_MEMSET_FOR_ZERO == 1)

    // MacOS
    /*
    Call memset through a static volatile pointer, which the compiler
    should not elide.
    */
    static void* (*const volatile memset_ptr)(void*, int, size_t) = std::memset;
    (memset_ptr)(ptr, 0, n);

#else

    // MacOS
    /*
    Cast to a volatile pointer and set each byte in a loop, which the compiler
    should not elide.
    */
    volatile uint8_t* p = reinterpret_cast<volatile uint8_t*>(ptr);

    for(size_t i = 0; i != n; ++i) {
        p[i] = 0;
    }

#endif

} // erase

} // Secure::Memory::Erase

extern "C" {

void secure_memory_erase(void * ptr, size_t n) {

    Secure::Memory::Erase::erase(ptr,n);

}

} // extern C

Secure/Memory/Erase.hsc:

{-# LANGUAGE CApiFFI #-}

module Secure.Memory.Erase where

import Prelude

import Foreign.C.Types
import Foreign.Ptr

#include "secure_memory_erase.h"

foreign import capi safe "secure_memory_erase.h secure_memory_erase"
    secureMemoryErase
        :: Ptr a -- ^ mem
        -> CSize -- ^ bytes
        -> IO ()

Which solution actually gets used depends on the cabal os flags - botan C++ uses a configure script step to detect the OS on install, correlating it with an OS-dependent capabilities list that gets turned into a set of CPP #ifdef flags. We do the same, using the cabal os instead:

if os(windows)
    cxx-options: -DSME_TARGET_OS_HAS_RTLSECUREZEROMEMORY
if os(linux) || os(freebsd) || os(openbsd)
    cxx-options: -DSME_TARGET_OS_HAS_EXPLICIT_BZERO
if os(netbsd)
    cxx-options: -DSME_TARGET_OS_HAS_EXPLICIT_MEMSET

It is not properly tested, however, and we could potentially support a wider / more fine-grained range of operating systems (botan C++ includes a rather extensive list).

I’d potentially like to ditch the C++ in favor of pure C, as well as adding a flag to use botan C++ as a dependency instead of the included source, but so far, it appears to be working.

Ideally, there should be also a Scrubbed or Secure typeclass that requires a secure erasure function, thus making the actual solution more of a choice, and then we’d simply be providing a sensible default secureMemoryErase function without getting in the way of other solutions. It may be a little convoluted, but this would allow botan to depend on secure-memory-erase while still using its own functions internally.

Between this and the bytearray-classes that I’ve been sussing out, I think I have separated out and established the critical functionality that should make an update to botan-low & botan actually meaningful, but also more broadly applicable.


Hey everyone, it’s good to be updating again. I’ve been away for health reasons - I’m doing rather well lately but I very consciously have to take it easy in order to maintain that. I may be a bit quiet or slow to respond, but I’m here, and I’ll be taking my time as to make sure I don’t deplete myself again.

7 Likes