Botan bindings devlog

Please consider this bit carefully and try to avoid these abstractions as much as possible, even for the final interface. There is potential to overengineer here and make it hard for clients of the library to figure out how to use the API, as these abstractions can significantly obscure it.

2 Likes

There is a long-term method to my madness; I am trying to keep the ergonomics of use in mind, but botan is primarily a cryptographic kitchen sink, and so it is my duty to expose the full complexity of every algorithm and primitive. botan is far closer to crypton than something like saltine.

However, I am planning for a user-friendly saltine-like interface as well, and I think your concerns align more closely there. The end goal is to expose a pure, idiomatic interface, but all of the mutable IO and machinery underneath is still necessary; this is part of that important machinery, and not necessarily the final interface.

A few things to help address your concerns:

  1. I am being sure to use data families and not type families; this significantly reduces strain on the compiler, as data families do not inhibit type inference at usage sites the way type families do, and instead place the onus on me to write the boilerplate.

  2. This was always going to be complex, no matter how I chose to express it, and having tried several ‘simpler’ routes, it turns out that while data families do have more initial overhead for development, their complexity for use grows much more slowly. In comparison, the ADT approach is easy to develop, and terrible to use.

  3. I am also writing a series of high-level tutorials focusing on explaining the correct usage, which in most cases consists of using a single type application or explicit type annotation. This interface affords a great deal of flexibility and specificity, as I have been quite thorough.

The following lines are all valid and equivalent:

sha3_512 "Fee fi fo fum!"
sha3 @512 "Fee fi fo fum!"
sha3 "Fee fi fo fum!" :: SHA3_512Digest
sha3 "Fee fi fo fum!" :: SHA3Digest 512
sha3 "Fee fi fo fum!" :: Digest (SHA3 512)
sha3 "Fee fi fo fum!" :: Digest SHA3_512
hash @(SHA3 512) "Fee fi fo fum!"
hash @SHA3_512 "Fee fi fo fum!"
hash "Fee fi fo fum!" :: SHA3_512Digest
hash "Fee fi fo fum!" :: SHA3Digest 512
hash "Fee fi fo fum!" :: Digest (SHA3 512)
hash "Fee fi fo fum!" :: Digest SHA3_512
-- 03a240a26674994a...

There is a great deal of freedom of choice, but from this we may expose a final, curated interface, eg by picking a particular idiom and emphasizing it in the tutorials.

  4. The cryptographic typeclasses are being designed to expose a consistent, implementation-agnostic cryptographic interface, and will eventually be lifted out into their own package with botan continuing to conform to it. This will allow alternate backends to fulfill the required primitives, while allowing more complex implementation-agnostic concepts like Merkle trees to be implemented without locking yourself to one backend library. This will also allow the final curated interfaces to be reused across backends, and is part of the long-term goal of a unified Haskell cryptography ecosystem.
3 Likes

Honestly, I still think the best abstraction here is backpack. At least at a conceptual level, it simply feels like the right thing.

I’d be happy to have a chat about it, even if just to put it on your map.

In practice, however, it may be that backpack is not a viable option due to lack of support in Stack and HLS — on the other hand, that could be a strong motivation to implement support for it.

2 Likes

I do not at all oppose investigating the use of backpack to make swapping implementations easier, but the ability to pull it out as an implementation-agnostic interface is merely a convenient side effect of being extremely stringent in acknowledging the unique properties of all of the various algorithms :grin:


I feel as though I need to clarify something for the general audience, however, as there appears to be some wariness or misunderstanding of why the typeclasses are being developed:

The typeclass abstractions are still necessary within botan as a single package, in order to expose multiple algorithms for the same primitive in a type-safe manner without throwing exceptions, or pretending that algorithm variants don’t exist.

Many algorithms have unique knobs and levers, and it is only at the very end, when we process the plaintext, that they act similarly. Digest MD5 is not the same type as Digest (SHA3 512), and should not be considered equal, even if they were to contain the same bytes.

Some algorithm families are parameterized by a unique salt / identifier / tweak that is not necessarily bounded in size, and thus there may be an infinite number of instances to write manually. Skein512 512 "foo" takes two additional parameters compared to MD5 - how are we to provide the additional arguments while providing a uniform interface?

Typeclasses were designed for this; the alternative is passing an algorithm witness around exactly like you might a typeclass implementation dictionary, while having to handle errors when someone passes in a parameter that isn’t valid for a given algorithm / operation - errors that would normally be handled by the type system via typeclass constraints.
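
To make the contrast concrete, here is roughly what that witness-passing alternative looks like - a sketch with hypothetical names and the backend primitives stubbed out - where an invalid size becomes a runtime error instead of a type error:

import Data.ByteString (ByteString)

-- Hypothetical algorithm witness and error type
data HashAlg   = MD5Alg | SHA3Alg Int | Skein512Alg Int ByteString
data HashError = InvalidDigestSize Int

-- Every call must validate the witness at runtime; the type system
-- can no longer reject an invalid size like SHA3Alg 300
hashWith :: HashAlg -> ByteString -> Either HashError ByteString
hashWith MD5Alg msg = Right (runMD5 msg)
hashWith (SHA3Alg n) msg
    | n `elem` [224, 256, 384, 512] = Right (runSHA3 n msg)
    | otherwise                     = Left (InvalidDigestSize n)
hashWith (Skein512Alg n salt) msg   = Right (runSkein512 n salt msg)

-- Backend primitives elided; assume botan-low exposes something similar
runMD5      :: ByteString -> ByteString
runMD5      = undefined
runSHA3     :: Int -> ByteString -> ByteString
runSHA3     = undefined
runSkein512 :: Int -> ByteString -> ByteString -> ByteString
runSkein512 = undefined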

I’d be writing the newtypes anyway for per-algorithm type-safety - take a look! The data families are effectively free and the typeclasses just codify everything that I’m doing anyway.

We have our data family and typeclass - not scary at all :slight_smile: :

data family Digest hash

class (Eq (Digest hash), Ord (Digest hash)) => Hash hash where
    hash :: ByteString -> Digest hash

We could implement simple concrete types for our algorithm this way, without the typeclass and data families:

newtype MD5Digest = MkMD5Digest { getMD5ByteString :: ByteString }
    deriving newtype (Eq, Ord)

md5 :: ByteString -> MD5Digest
md5 = MkMD5Digest . Botan.hash Botan.md5

However, conformance to the typeclasses is effectively free:

-- Free
data MD5

-- Free, by replacing "newtype MD5Digest" with "newtype instance Digest MD5"
newtype instance Digest MD5 = MkMD5Digest { getMD5ByteString :: ByteString }
    deriving newtype (Eq, Ord)

-- Unchanged
md5 :: ByteString -> MD5Digest
md5 = MkMD5Digest . Botan.hash Botan.md5

-- Free
type MD5Digest = Digest MD5

-- Free
instance Hash MD5 where
    hash :: ByteString -> Digest MD5
    hash = md5

Something with a few type parameters isn’t much harder.

data SHA3 (n :: Nat)

-- Pardon this atrocity
type SHA3Size (n :: Nat) = (KnownNat n, (n == 224 || n == 256 || n == 384 || n == 512) ~ True)

newtype instance Digest (SHA3 n) = MkSHA3Digest { getSHA3ByteString :: ByteString }
    deriving newtype (Eq, Ord)

instance (SHA3Size n) => Hash (SHA3 n) where
    hash :: ByteString -> Digest (SHA3 n)
    hash = sha3

-- The only actual work involves type literals
sha3 :: forall n. (SHA3Size n) => ByteString -> Digest (SHA3 n)
sha3 = MkSHA3Digest . Botan.hash h where
        n = fromInteger $ natVal $ Proxy @n
        h = fromJust $ Botan.sha3 n

-- We still export user-friendly functions at the end
sha3_512 :: ByteString -> Digest (SHA3 512)
sha3_512 = sha3 @512

I want to stress that these typeclasses aren’t replacing anything, they’re only augmenting. I hope this gives a better idea as to the reasoning behind my approach.

2 Likes

Yes yes yes! That does clarify things, thank you.

That example is quite good. Looking forward to seeing more of this work.

1 Like

I think instead of the atrocity, the following might be more helpful:

type family IsSHA3Size (n :: Nat) :: Constraint where
    IsSHA3Size 224 = ()
    IsSHA3Size 256 = ()
    IsSHA3Size 384 = ()
    IsSHA3Size 512 = ()
    IsSHA3Size n   = TypeError
        ('Text "Invalid SHA-3 digest size: " ':<>: 'ShowType n
         ':$$: 'Text "Expected one of: 224, 256, 384, 512")

type SHA3Size (n :: Nat) = (KnownNat n, IsSHA3Size n)

This way the error message will be a lot more helpful.

2 Likes

Update: botan 0.0.1.0 package candidate uploaded!

As promised, it’s here:

There’s still a good bit of work to be done before release, but I’ll be giving it the same level of love that botan-bindings and botan-low got while they were a package candidate - which was a lot.

First up, I’ve flattened the algorithm types for most operations (see issue #1 from earlier) with an overall positive result.

This is really visible in something like Botan.Cipher, where we have specific subsets that we need to deal with (Cipher vs AEAD) while also taking specific subsets of other operations as arguments (BlockCipher vs BlockCipher128). This new method is proving to be significantly better to use, so I’ll be finishing up and applying it to the remaining operation / algorithm data types as necessary.

Secondly, I’ve decided not to elide the cryptography typeclasses from the 0.0.1 release (see issue #2 from earlier), but will still be postponing their continued development until after the 0.0.1 release, so as to stay focused on the immediate needs - prioritization! :upside_down_face:

The repo has been updated

4 Likes

Back in Action after getting some Rest

After getting botan-bindings and botan-low to the initial 0.0.1 release and botan to package candidate status, I found myself in need of a bit of a break - both because I needed to pull back and get a bird’s eye view of it, and because I’ve been working on this project almost non-stop for 8 months now and everyone deserves a bit of vacation now and then. :sleeping:

Now that’s over, and it is time to get back in action! As this project transitions to the next phase of its life cycle, I’ve been writing an official Haskell Cryptography Group milestone blog post and call-for-users, taking into account feedback on the project and the round 2 funding proposal (which I am also still working on :slight_smile:). I’ll be discussing the history of this project and how it all started, what we’ve accomplished so far, what feedback we’ve heard / users want, and what flag we’ve planted on the horizon as our goal, as well as concrete details on the specific things we wish to accomplish next.

You can expect regular updates to resume Monday next week!


Minor note: The project has now officially transferred ownership to the Haskell Cryptography Group, and as planned, the repository has been moved:

7 Likes

A more sizeable update and proper First Milestone blog is coming soon :slight_smile: In the meantime, please enjoy this sneak peek at something I had fun implementing while on the break!

Just updating some old code to use botan-low

2 Likes

Returning from Absence.

I underestimated how much rest I needed, but in the words of Granny Weatherwax, “I ate’nt dead”.

This last month has been half taking a break, half writing out the milestone blog and plans for the future, and half trying to figure out how to enable that future. For the moment, things are out of my hands, so instead, I’m going to do what I do best, which is the code, and update the devlog.

Things that have changed in the last few weeks:

  • Cryptographic typeclasses have been temporarily moved to their own branch, to focus on releasing botan
  • Minor improvements to utility functions and error handling
  • Botan.RNG significantly reworked and cleaned up
  • Botan.KDF, PwdHash, HOTP, TOTP are now release-ready and have tutorials and explanations
  • Stable implementation errors in botan_pwdhash_timed are now handled properly - see […] confusing param mapping between pwdhash and pwdhash_timed
  • A minor update to botan-bindings and botan-low may be necessary to improve constants and organization of named elliptic curves, logarithm groups, and raw algorithm modes.

This isn’t pushed yet - I’ll let you know when it is.

8 Likes

Hey @ApothecaLabs, is there a good place to chat with you/the botan devs about development? I’m currently working through seeing how feasible it would be to swap crypton for botan in amazonka, and I’m running into a lot of issues with the design of the hashing API. I was wondering if there’s a good IRC channel or somewhere to chat - here feels like the wrong place but I can do that if needed.

Quick summary of the issues:

  • There’s no pure, incremental hashing API (you probably saw my GitHub issue), which makes writing things like pure streaming sinks difficult - Monad m => Conduit ByteString () m (Digest a) can’t currently be written, it needs MonadIO instead (and with the current assumption that all hash state will be mutably updated, there could be bugs in how something like that gets used, depending on m).
    As an aside here, linear types would make this API both pure and quite nice to work with - it probably shouldn’t be the only interface, but it would avoid the required copying of the context that a pure API would necessitate
  • The Digest data family is difficult to work with generically; it’s nearly impossible to write something that works on the wrapped ByteStrings in functions with types like Hash a => Digest a -> ByteString. I’d love to see a function digestBytes :: Digest a -> ByteString added to the Hash class, at the very least (see the sketch after this list).
  • I’ve also grown quite fond of memory's ByteArrayAccess class and its ability to abstract types like digests so they can easily be hashed too. I think that cryptonite got a lot right with its use of memory; forcing everything to be a ByteString limits things a lot for production code.
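
For reference, the digestBytes suggestion would look something like this - hypothetical, not part of the current botan API:

{-# LANGUAGE TypeFamilies, FlexibleContexts #-}

import Data.ByteString (ByteString)

data family Digest hash

class (Eq (Digest hash), Ord (Digest hash)) => Hash hash where
    hash        :: ByteString -> Digest hash
    -- Hypothetical addition: recover the raw bytes generically
    digestBytes :: Digest hash -> ByteString

-- Each instance would just unwrap its newtype, eg:
-- instance Hash MD5 where
--     hash        = md5
--     digestBytes = getMD5ByteString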

I’m sure I’ll run into more, but this is where I am at the moment. I’m aware that botan is unreleased, but wanted to at least investigate whether it’ll be usable in Amazonka. At the moment there isn’t a clear path that wouldn’t affect performance.

I should also mention that I’m happy to contribute to the development of the changes I’d like to see - I should make a fork and start playing around with ideas.

5 Likes

Since we’re on the topic of Amazonka, I should also say that I’d love to see botan become the cryptography library that it rests on, but there are a few things beyond @Axman6’s exploration that I’d also like to see happen:

  1. The changes to support FFI really should make it into botan upstream, so that we’re depending on the same vetted code as everyone else.
  2. The Haskell packages and the native library should have a good build/download story for all major platforms. I don’t want to accidentally cut off users on e.g., Windows if there’s no easy way to get them the native deps they need. (Also, AFAICT, Botan 3 is only in Debian experimental.)
  3. botan-bindings should not have to vendor any source code. I’m leery of packages which vendor source code, especially in the cryptography and security spaces.

These sound like pretty big asks to me, so I want to get them on the table now as a starting point for discussion. Maybe I’m being too dogmatic, or maybe they’re easier than I think, but let’s talk about them at least.

5 Likes

As already explained elsewhere: [Haskell Cryptography Group] Botan: The First Milestone - #14 by hasufell

3.3.0 is in Debian experimental and, oddly enough, Windows is likely the least problematic platform.

I tried to create a botan-clib package, but the build system is a disaster: which source files to compile depends on platform and processor features that are detected during “configure”. You’d have to map those things manually in the cabal file and create flags for all the CPU features, because the C source files used need to be encoded in the cabal file - and then review the build system changes on every version bump.

These are things that, e.g., a Gentoo or nixpkgs developer would be comfortable doing, but it isn’t at all trivial. And you lose the automagic detection of features.

Another way might be to abuse custom setup really hard, but that’s not gonna be very popular or easy either.

2 Likes

@Axman6 I am down for a video chat to discuss your needs in depth - just send me a private message and we’ll pick a time.

I have a lot to say in response to all this, but saying it all would take time that I need to spend getting botan updated / released. So you get fragmented notes in the name of expediency :slight_smile: I hope they are not too incoherent.

  • It is brave of you to be testing out the high-level interfaces when they are unstable

    • Super glad you are interested
    • I am trying to keep this in mind, but when I push the release (cross your fingers that it’s within a day or two) things may change (for the better, I hope!)
    • The focus has been on getting everything working in at least one way - then, we can deal with different interfaces
  • Data families vs witness / proxy

    • I don’t mind providing more than one interface, I just happen to like data families
      • Am planning on interfaces that also take an alg value witness or Proxy alg as an alternative, like crypton/ite
    • Most other cryptography operations have multiple types that must work together, so data families really work for them
    • Witnesses are simple when you only have a single operation to apply them to
      • Witnesses get hard when you have to apply them to every function in a multi-step operation
      • Or when you need multiple or compound algorithms
    • Your specific use case is the classic weak spot for data family type inference, eg when:
      • Inferring from the return type as the only mention of the actual type as you cited
      • Inferring a type that is not visible to the outer scope (eg, digestBytes $ hash "foo")
    • Hashing only has one related type, which is also produced as an output
      • This means hashing in particular doesn’t work as well with data families, but it does work well with witnesses
      • Being consistent overall is worth the tradeoff though
      • We can still expose a witness / proxy -based interface too so this pain is temporary
  • There is a difference between incremental / online and mutable

    • I’m not surprised you are having issues, because I’ve been wanting to improve this area in particular - it’s confusing, so you should expect a lot of change here in the coming update(s)
    • Incremental is really about online, ie chunked, processing; I want to reduce it to strict vs lazy ByteStrings, so a typeclass may then be unnecessary
    • Mutable is really about stateful contexts
      • A crypton/ite -style mutable interface is planned
  • Pure interfaces - only Hash has copyState, so specifically for hashes it is possible to make a stepwise pure interface out of a mutable interface, but for other operations it is not (see the sketch after these notes)

    • This is actually sensible if you stop to think about it
      • For most cryptographic operations, copying state is highly undesirable or an outright information leak, so they don’t support it
      • Hashing is a rare exception: you may be hashing many things sharing the same prefix, so there is a major efficiency gain
      • The same is not necessarily true for other operations eg encryption and signing, especially anything using nonces
    • The botan hash state is mutable, so I am highly averse to pretending that it is pure
      • Need to guarantee that the IO actions are only run once each, and that they are run in the right order
    • We can make a copy of the hash state for every ‘pure’ function - inefficient, but possible
    • We can pretend that it’s pure if we pretend that it is linear (ie, the result is only used once, as an argument for the next action)
    • Trying to avoid a class-splosion
    • A similar issue regarding pure vs mutable interfaces can be seen in random with RandomGen vs StatefulGen
  • Regarding memory, I too liked the ByteArrayAccess class and friends; however, there are issues that make me hesitant to include it as a dependency

    • The source repository has been archived by the author and is read-only
    • It does not appear to be maintained (and so may warrant initiating a takeover process)
    • botan-low has an absolutely minimal dependency footprint (aside from the botan library itself)
    • However, I have already been considering that it may be worth reviving those classes due to their utility
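
To illustrate the copy-based pure interface mentioned above - a sketch only, assuming the mutable botan context can be duplicated via copyState, with all names hypothetical:

{-# LANGUAGE TypeFamilies #-}

import Data.ByteString (ByteString)
import Data.List (foldl')

data family Digest hash

-- Hypothetical pure-incremental class: each update copies the underlying
-- mutable context, mutates the copy, and freezes it again
class IncrementalHash hash where
    data HashState hash
    hashInit   :: HashState hash
    hashUpdate :: HashState hash -> ByteString -> HashState hash
    hashFinal  :: HashState hash -> Digest hash

-- A pure streaming consumer then falls out as an ordinary fold,
-- no MonadIO required:
hashChunks :: IncrementalHash hash => [ByteString] -> Digest hash
hashChunks = hashFinal . foldl' hashUpdate hashInit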

I hope that is readable and relevant. It will make more sense when I publish the code.

1 Like

[memory] does not appear to be maintained (and so may warrant initiating a takeover process)

It is confirmed that none of Vincent Hanquez’ packages will be maintained, nor has he given an ok for taking it over. He’s ok with anyone forking it and using a new name, so that’s what’s been happening with crypton(ite) and (crypton-)x509 etc.

Zeno’s Paradox Progress

I’ll keep this short; progress toward an update / release is ongoing. I feel as though with every update we get about half the remaining work done, but discover more to do. However, any progress is good, and I’m trying to not stress about it :slight_smile: It has gotten a bit slow, because now I might spend an entire day chasing down one specific operation on one specific algorithm in order to verify that it is well-behaved. The closer you look, the more detail there is to be seen.

Regardless, progress:

  • All the botan modules are now 100% functional-complete
  • Significant reorganization of modules to reduce their number
  • Lots of improvements to PubKey, including post-quantum Kyber / Dilithium algorithm parameters
  • There are proper newtypes for everything, and so type safety is significantly improved
  • Basic unit tests for botan are in progress, and are the major blocker for release now
  • Some massaging / improvements are still needed regarding catching exceptions and turning them into Either / Maybe when appropriate
    • This bears pondering, as there are some failures where an exception is far more appropriate
    • An example: encrypt is a general function for multiple algorithms; some algorithms have specific handling requirements for plaintext length and chunking, and may throw an exception if the function is provided inappropriate arguments. This failure is different from failing to decrypt because of a wrong key or auth tag, which should return Nothing rather than throw an exception.
    • This is important because otherwise we have to make a function like encrypt return a Maybe Ciphertext even though encryption really shouldn’t fail the same way decrypt can. This would result in an un-ergonomic interface.
    • I am strongly leaning towards providing validation functions for input, throwing exceptions in the standard functions (and maybe an Either variant that doesn’t), and saving Maybe for functions like decrypt that are expected to fail in specific ways - see the sketch after this list.
  • All of the typeclass / type family / algorithm-specific modules are still separated out for the moment, will need some minor merging back in, and may not make the initial botan release (so as to better focus on the immediate needs)
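
Here is the shape I am leaning towards, sketched with placeholder types and the bodies elided - not final:

import Data.ByteString (ByteString)

newtype Key        = Key ByteString
newtype Nonce      = Nonce ByteString
newtype Ciphertext = Ciphertext ByteString

-- Validate inputs up front, so misuse is caught before touching the backend
validateEncryptArgs :: Key -> Nonce -> ByteString -> Bool
validateEncryptArgs = undefined -- length / chunking checks elided

-- Throws on invalid arguments (programmer error), never fails on valid input
encrypt :: Key -> Nonce -> ByteString -> Ciphertext
encrypt = undefined

-- Nothing on a wrong key / failed auth (an expected, recoverable failure)
decrypt :: Key -> Nonce -> Ciphertext -> Maybe ByteString
decrypt = undefined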

I’m trying to get this out sooner rather than later, but it will require a minor update to the lower libraries, and I’ll need to make sure I don’t goof that up - I’m thinking I’ll bump botan-bindings and botan-low from 0.0.1 to 0.1.0, because I think there are a few minor breaking changes, and release the first version of botan as 0.1.0.

5 Likes

What to do about memory re: crypton/ite?

I’ve been absent for some time due to the need to focus on personal health. It can be a struggle to reach out at times like these, but this last month I’ve not been idle. Herein lie the results of my investigation in the meantime.

In order for botan to provide a palatable alternative to crypton/ite, it is important to understand how people are using it, and what dependencies it implies. One of those dependencies is memory, which supplies a variety of typeclasses and data types regarding low-level pointers and memory. This package, like cryptonite and by the same author, has been archived, and will not be updated in the future.

This is problematic, because even after forking crypton/ite, many libraries still have a transitive dependency on memory. The reason for this is that one of the prominent design choices was to use constraints instead of concrete data types in many of the typeclasses and functions.

For example:

class Cipher cipher where
    ...
    cipherInit :: ByteArray key => key -> CryptoFailable cipher 
    ...
class Cipher cipher => BlockCipher cipher where
    ...
    ecbEncrypt :: ByteArray ba => cipher -> ba -> ba
    ecbDecrypt :: ByteArray ba => cipher -> ba -> ba 
    ...

Thankfully, most algorithm-specific outputs are wrapped in a concrete newtype.

hash :: (ByteArrayAccess ba, HashAlgorithm a) => ba -> Digest a

That does seem to be the general rule: unstructured blobs of binary data (mostly plaintext / ciphertext / key / salt / password -type data) are represented as a ByteArray, but it can be slightly onerous when parameters are allowed to be heterogeneous:

bcrypt
    :: (ByteArray salt, ByteArray password, ByteArray output)
    => Int -> salt -> password -> output

As a result, unless one is dealing exclusively with ByteString, and / or the data types supplied by crypton/ite itself, use of memory is almost compulsory. Furthermore, if we wish to provide a similar interface using botan as a backend, this means either accepting memory as a dependency, or doing something about it.

As such, I have spent a lot of time familiarizing myself with its guts and how it is used. I really wanted to know the intent behind many of the decisions the author made (such as how the different allocation / create functions need to be used re: inlining and unsafePerformIO) because it doesn’t do to simply copy and paste. Here’s what I have learned:

The following libraries use both crypton/ite and memory, but there are many more.

  • amazonka-core
  • stack
  • tls
  • password
  • x509

One can look at the reverse dependencies for crypton/ite and memory to see the general cross-section. Thankfully, most dependents of crypton/ite found ByteString and the supplied data types sufficient, and so they did not propagate the issue by depending on memory directly, only indirectly.

So, what is memory?

It is a small, innocuous package, mostly exposing the following classes that allow ByteString and primitive:Data.Primitive.ByteArray-like access to pointers to contiguous memory:

-- Data.ByteArray

class ByteArrayAccess ba where
    length :: ba -> Int
    withByteArray :: ba -> (Ptr p -> IO a) -> IO a
    copyByteArrayToPtr :: ba -> Ptr p -> IO ()

class (Eq ba, Ord ba, Monoid ba, ByteArrayAccess ba) => ByteArray ba where

    allocRet :: Int -> (Ptr p -> IO a) -> IO (a, ba)

data Bytes

data ScrubbedBytes

-- Data.ByteArray.Sized

class (ByteArrayAccess c, KnownNat n) => ByteArrayN (n :: Nat) c | c -> n where
    allocRet :: forall p a. Proxy n -> (Ptr p -> IO a) -> IO (a, c)

data SizedByteArray (n :: Nat) ba

These classes allow for fast and efficient manipulation of memory, and basically wrap up and classify a lot of low-level functions that GHC uses internally - incredibly powerful, and about as far as you can go without dropping down to using primitive and MagicHash directly.

Notably, the package does not distinguish between pinned and unpinned memory, though the unspoken assumption is that the memory pointed to in withByteArray :: ba -> (Ptr p -> IO a) -> IO a will not be moved for the duration of that operation.

Aside from a few minor quibbles with hierarchy and nomenclature (for consistency with Data.Primitive.ByteArray and Foreign.*, and for better separation of bytewise vs elementwise functions), it is a useful set of classes. I see no reason to make large changes to it.
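
As an example of why these classes matter for a binding like botan, here is the typical FFI pattern they enable, using the actual Data.ByteArray API:

import qualified Data.ByteArray as BA
import Data.Word (Word8)
import Foreign.Ptr (Ptr)

-- Feed any ByteArrayAccess value to a C-style update function without
-- first converting it to a ByteString
updateVia :: BA.ByteArrayAccess ba => (Ptr Word8 -> Int -> IO ()) -> ba -> IO ()
updateVia update ba = BA.withByteArray ba $ \ptr -> update ptr (BA.length ba)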

So, how is memory being used?

I do wish to understand how memory is being used, and so I have gone over many of the direct dependents in order to collate their use cases.

A few particular use cases dominate:

  • ByteArrayAccess and ByteArray classes
  • convert and constEq functions
  • convertToBase functions

Aside from that, there are several errata in the package that we must also consider. Notably:

  • ScrubbedBytes
  • SizedByteArray and ByteArrayN

ByteArrayAccess and ByteArray

By far, the most oft-used parts of this package are the classes themselves. The major user of these classes is the crypton/ite ecology, and they are the core of the package. Enough said - high priority.

convert and constEq

If they aren’t using ByteArray directly as a general ByteString-like class, chances are they are using memory in order to convert between ByteArray-conforming types. This is actually quite sensible, as many cryptography-related packages have some modules / classes / functions for converting various primitives / newtypes to and from bytestrings.

The other common use is to pull in constEq for constant-time equality checks for sensitive information.

Quite often, the memory package is pulled in just for one or both of these. Between that and the classes themselves, we’ve covered the majority of memory use cases.
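
Both uses fit in one small example (these are memory's actual APIs):

import qualified Data.ByteArray as BA
import Data.ByteString (ByteString)

-- Convert between any two ByteArray representations
toScrubbed :: ByteString -> BA.ScrubbedBytes
toScrubbed = BA.convert

-- Constant-time comparison, eg for MAC verification
macMatches :: ByteString -> ByteString -> Bool
macMatches = BA.constEq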

Base and convertToBase

Seeing occasional use, memory also exposes functionality for converting between various bases in the Data.ByteArray.Encoding module. This is important, but not critical, because there are already packages for converting bytestrings to various bases.
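
Typical use looks like this (again memory's actual API):

import Data.ByteArray.Encoding (convertToBase, Base(Base16))
import Data.ByteString (ByteString)

-- Hex-encode a digest or key for display / serialization
toHex :: ByteString -> ByteString
toHex = convertToBase Base16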

SizedByteArray and ByteArrayN

Seeing infrequent use (eg, NaCl). Essentially a Nat-parameterized ByteArray, it might have been called FiniteByteArray if being consistent with Bits and FiniteBits.

ScrubbedBytes

Seeing infrequent use (eg, NaCl).

memory supplies a ScrubbedBytes data type for sensitive memory that protects its Show instance and is automatically zeroed before release. This is a really important but also niche use, and I only observed one or two packages using this functionality.

Here is how the zeroing of memory works for ScrubbedBytes:

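-- NB: sz is the buffer size captured from the enclosing newScrubbedBytes, shown in full below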
getScrubber :: Addr# -> State# RealWorld -> State# RealWorld
getScrubber addr s =
    let IO scrubBytes = memSet (Ptr addr) 0 (I# sz)
     in case scrubBytes s of
            (# s', _ #) -> s'

Essentially, it is using extremely low-level RealWorld magic to dynamically vend a memSet function call in the finalizer of ScrubbedBytes, in an effort to avoid the compiler eliding the zeroing due to the following free (that is, many optimizing compilers will discard a memset that is immediately followed by free, because such a memset is normally considered to be a no-op).

Even if a compiler emits the instructions to zero memory in the compiled binary, that is no guarantee that it will be run, because the CPU may still optimize it out at runtime. As a result, most operating systems provide a specific OS-dependent function to ensure that the memory is zeroed, and C itself has only recently standardized guaranteed alternatives (memset_s, and now memset_explicit in C23).

I cannot attest to the efficacy of this functionality regarding GHC specifically, but for reference, here is how Botan::secure_scrub_memory works in C++

void secure_scrub_memory(void* ptr, size_t n) {
#if defined(BOTAN_TARGET_OS_HAS_RTLSECUREZEROMEMORY)
    ::RtlSecureZeroMemory(ptr, n);

#elif defined(BOTAN_TARGET_OS_HAS_EXPLICIT_BZERO)
    ::explicit_bzero(ptr, n);

#elif defined(BOTAN_TARGET_OS_HAS_EXPLICIT_MEMSET)
    (void)::explicit_memset(ptr, 0, n);

#elif defined(BOTAN_USE_VOLATILE_MEMSET_FOR_ZERO) && (BOTAN_USE_VOLATILE_MEMSET_FOR_ZERO == 1)
    static void* (*const volatile memset_ptr)(void*, int, size_t) = std::memset;
    (memset_ptr)(ptr, 0, n);
#else

    volatile uint8_t* p = reinterpret_cast<volatile uint8_t*>(ptr);

    for(size_t i = 0; i != n; ++i)
        p[i] = 0;
#endif
}

Replacement of this functionality is considered important, but at a lower priority than the functionality above.

Other errata

Outside of that, there are portions of memory that see little to no actual use:

  • Parser and Packer modules
  • Endianness
  • Word128
  • The SipHash and FNVHash modules

I do not think we need to worry about these for the time being.

So, what do we do about it?

I think it would be wise to provide ByteArray-like functionality in botan for better compatibility with crypton/ite and easier migration, and this assessment helps illustrate what capability needs to be preserved, and what can be ignored.

Thus I am considering creating a slimmed-down bytearray-classes package that plucks out the necessary classes and functions, allowing us to depend on it instead of memory.

We definitely want the bulk of these modules, maybe ignoring the concrete data types and focusing only on the classes:

  • Data.ByteArray
  • Data.ByteArray.Encoding
  • Data.ByteArray.Sized
  • We also probably do want to make some data types a la Bytes and ScrubbedBytes

Using bytearray-classes

Migration would ideally be simple, as I don’t intend on changing the interfaces much, but this actually leads to a rather thorny problem and important decision - where in botan does this library get used? The answer is slightly frustrating, and one that gives me no small amount of anxiety:

For improving compatibility / ease of migration, the best place to use bytearray-classes and ByteArray constraints like crypton/ite is in botan-low.

The reasoning here is that the concrete ByteString functions can be obtained from the ByteArray functions simply by restricting the types, and we would be able to make all of the internal logic and function generators for allocating ByteStrings much more consistent.

This of course depends on whether people actually prefer a ByteArray constraint over a concrete ByteString parameter at all, as you may find use of TypeApplications suddenly necessary.
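
For example, with a hypothetical botan-low signature generalized over ByteArray, the concrete ByteString version falls out by restriction, but callers must now pin down the output type:

import qualified Data.ByteArray as BA
import Data.ByteString (ByteString)

-- Hypothetical botan-low operation generalized over ByteArray
hashBytes :: (BA.ByteArrayAccess bin, BA.ByteArray bout) => String -> bin -> IO bout
hashBytes = undefined -- FFI allocation logic elided

demo :: ByteString -> IO ()
demo msg = do
    -- The output type must be pinned at the call site, eg with an annotation:
    digest <- hashBytes "SHA-256" msg :: IO ByteString
    print digest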

5 Likes

Regarding amazonka-core: we only seem to directly use memory for base{16,64} encoding and it might be easy to remove. Issue

Ah, it is not, but I suspect this is a major cause of direct dependencies on memory:

This is not straightforward: the various Digest a values that we get from crypton can only be interacted with using the ByteArrayAccess interface from memory, which is why amazonka-core has polymorphic functions for base{16,64} encoding. And if we move to botan we’ll be able to get encoders from there anyway, so there’s little point in doing this refactoring twice.

The password library also doesn’t really need (or at least want) to use memory.

The convert to Bytes that we have is only in there because benchmarks showed it is slightly more performant, and if we’re depending on memory already (because of crypton(ite)), then why not.
The one we DO want/need is constEq, for security reasons. Though I feel that is something that could also just be added to the bytestring library - then we’d use that one.

2 Likes

What to do about memory, part 2: Secure memory erasure

Aside from the ByteArray classes, there is another feature that I wish to extract from memory: ScrubbedBytes - or rather, the functionality that backs it.

What is ScrubbedBytes?

I got into it a bit last post, briefly covering ScrubbedBytes as one of the occasionally used features from memory.

To quote the documentation:

ScrubbedBytes is a memory chunk which have the properties of:

    - Being scrubbed after its goes out of scope.
    - A Show instance that doesn't actually show any content
    - A Eq instance that is constant time

Data types like these are used to avoid accidentally leaking sensitive data and / or personally identifiable information through means such as logs and core dumps, and any production team worth their salt has a process set up to handle this (usually accompanied by / piped straight into analytics).

Breaking it down

  1. The memory is scrubbed after it goes out of scope.

When all references to a ScrubbedBytes instance are gone, the finalizer runs, and the data will be explicitly zeroed out. This is great! Kind of…

But what does it actually do? Well, if we peek into the source code of newScrubbedBytes, we see that ScrubbedBytes is just an aligned, pinned byte array with a finalizer. The finalizer contains a dynamically-generated scrubber function, which eventually calls memset on the memory region.

ScrubbedBytes.hs:

newScrubbedBytes :: Int -> IO ScrubbedBytes
newScrubbedBytes (I# sz)
    -- Elided zero-size cases
    | otherwise               = IO $ \s ->
        case newAlignedPinnedByteArray# sz 8# s of
            (# s1, mbarr #) ->
                let !scrubber = getScrubber (byteArrayContents# (unsafeCoerce# mbarr))
                    !mba      = ScrubbedBytes mbarr
                in case mkWeak# mbarr () (finalize scrubber mba) s1 of
                    (# s2, _ #) -> (# s2, mba #)
  where
    getScrubber :: Addr# -> State# RealWorld -> State# RealWorld
    getScrubber addr s =
        let IO scrubBytes = memSet (Ptr addr) 0 (I# sz)
        in case scrubBytes s of
                (# s', _ #) -> s'
    finalize :: (State# RealWorld -> State# RealWorld) -> ScrubbedBytes -> State# RealWorld -> (# State# RealWorld, () #)
    finalize scrubber mba@(ScrubbedBytes _) = \s1 ->
        case scrubber s1 of
            s2 -> case touch# mba s2 of
                    s3 -> (# s3, () #)

PtrMethods.hs:

memSet :: Ptr Word8 -> Word8 -> Int -> IO ()
memSet start v n = c_memset start v (fromIntegral n) >>= \_ -> return ()

foreign import ccall unsafe "memset"
    c_memset :: Ptr Word8 -> Word8 -> CSize -> IO ()

This attempts to avoid leaking sensitive data in core dumps and memory (I believe it is attempting to use the crossing of FFI / module boundaries between the memset and free to ensure it is not elided), and is functionality that needs to be preserved - but it’s just calling memset from the C standard library, and it isn’t exposed in a manner that I am comfortable with, either. In the absence of another mechanism, relying on GHC triggering GC to clean up sensitive memory is just begging for trouble.

Also, this is all done in an attempt to prevent the compiler from eliding the memset because it occurs immediately before a free. Many compilers will automatically remove any writes to a memory region that is about to be freed, which has been a significant thorn in the side of computer security, and has caused countless leaks - there are better options than memset, more on that later.

I would prefer a bracketed withScrubbedBytes -style function over relying on a data type with finalizers, but it is really the secure / guaranteed erasure capability that we are after, and using it in a finalizer is better than nothing.

Notably, this is something that cannot be handled in Haskell - it requires dropping down to C.

  2. A Show instance that doesn’t actually show any content

When we print a value, we don’t want to show sensitive data or PII, so instead we sanitize it. This doesn’t stop you from directly accessing the raw contents and intentionally printing them, but it can save you from accidental exposure, eg if a debug statement makes it into production.

This isn’t really specific to any data type or function or even language. Any language with a Show or Printable can override their debug / print / log instances to avoid any sensitive data from being exposed. Thus, preserving this functionality doesn’t require any special effort.
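
For example, any sensitive wrapper type can already do this today:

import Data.ByteString (ByteString)

newtype Password = Password ByteString

-- Redact the contents instead of leaking them into logs
instance Show Password where
    show _ = "Password <redacted>"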

Perhaps instead of requiring the logic in Show, it could be codified in a typeclass SecureShow that provides a printing formatter that automatically elides secure sub-values via Generics or something. However, that requires then that there be a secure logging mechanism that utilizes it, and is a bit far out for the moment.

  3. An Eq instance that is constant time

When we compare sensitive data for equality, the goal is to report whether or not they are equal, and nothing more.

This requires special handling, because if you short-circuit the comparison, how long the comparison takes may leak information - eg, if we stop comparing at the first differing byte, an observer learns something about where in the data the difference occurs, and thus the implementation leaks more information than just ‘equal or not’.

This is not especially problematic, as we simply need to ensure that the entire data is compared before returning the result.

Additionally, constEq is already part of the ByteArray classes, and can be preserved from there.
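
For illustration, the usual trick is to accumulate every byte difference before inspecting the result - a sketch, not memory's exact implementation, and it assumes lengths are not secret:

import Data.Bits (xor, (.|.))
import Data.List (foldl')
import qualified Data.ByteString as BS

-- Compare every byte regardless of where the first mismatch occurs
constTimeEq :: BS.ByteString -> BS.ByteString -> Bool
constTimeEq a b =
    BS.length a == BS.length b
        && foldl' (.|.) 0 (BS.zipWith xor a b) == 0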

Replacing memset

Now, with our properties in mind, we can really focus on the big gotcha, which is replacing use of memset. But why? First, some history:

In the beginning, the ~~Universe~~ C standard was created without any concern for data security. This has made a lot of programs very insecure and been widely regarded as a bad move.

Compilers back then were not nearly as smart, and CPU design was much simpler; when you issued a memset call, you could expect that that memory region would be set. Then compilers got smarter and started optimizing. One of those optimizations is to discard any writes to memory that is about to be freed, and another is to leave the contents of malloc uninitialized. You may see where this is going - all of a sudden, data that had been cleared out using memset was reappearing in newly malloced data. This is not a problem if calloc is used, but calloc is slower, and the issue retroactively affected older code; we’ve been paying for it ever since.

Several general ways of handling secure memory erasure have since been developed:

  1. Adding a secure memset variant to the C standard, which is guaranteed not to be elided
  • memset (C89)
  • memset_s (C11)
  • memset_explicit (C23)

memset is the original function we are trying to replace. C11 added the memset_s function for secure erasure.

memset_s() cannot be elided. From K.3.7.4.1: “Unlike memset, any call to the memset_s function shall be evaluated strictly according to the rules of the abstract machine as described in (5.1.2.3). That is, any call to the memset_s function shall assume that the memory indicated by s and n may be accessible in the future and thus must contain the values indicated by c.”

Unfortunately, it was both improperly specified, and added as a part of Annex K, which was not a required part of the standard, and thus not widely implemented.

As a replacement for memset_s, C23 specifies the function memset_explicit, which should be preferred in new code. However, C23 is not yet finalized, and thus cannot be assumed to be available yet.

Thus, we should prefer to call memset_explicit if at all possible. I’m not sure whether we should even try to call memset_s; more investigation is needed.

  2. Calling an OS-specific secure memset variant
  • explicit_memset (NetBSD, Oracle)
  • RtlSecureZeroMemory (Windows)
  • explicit_bzero (FreeBSD, OpenBSD, Linux)
  • memzero_explicit (Linux)

Our secondary preference is to call an OS-specific function explicitly intended for securely erasing memory. There are a plethora of these, and we can select which one at compile-time by detecting the operating system.

We should prefer to call these if and only if memset_explicit is not available.

  3. Using volatile function or memory pointers to avoid being optimized out
  • calling memset on the memory region through a volatile function pointer
  • writing to the memory region through a volatile pointer
  • writing an extern value to the memory region

We should fall back on these if and only if no other solution is available. Notably, MacOS / clang uses this fallback in botan, with an accompanying note.

  4. Increasingly wild alternative methods
  • using mmap + MAP_ANONYMOUS + MAP_PRIVATE
  • using a function from a widely-used library (eg OpenSSL, libsodium, botan) which does 1-3 for you

These can’t be relied upon, for various reasons. Also, mmap is POSIX-only, and MAP_ANONYMOUS was only just added in POSIX Issue 8.

Doing it properly vs doing what we can

Doing it properly requires deeper support than just the C or ASM level. Modern CPUs are incredibly complex, with multiple levels of cache to handle, and in some ways the problem has just moved inside the CPUs, where the internal microarchitecture may still discard or optimize instructions out of the machine code unless the proper instructions are used. The real solution is for the OS to keep keys / sensitive data / PII inside a secure enclave, but that is OS- and hardware-level, and “don’t ever load it into local memory” is not a solution that is always available.

We can’t handle that here, so the methods described above are what we can do at the language level - we do our best to provide a solution for when we DO have sensitive information in local memory.

secure-memory-erase

To that end, I’ve plucked out and stripped down the secure-memory-erase code from botan C++, and have been looking at / comparing it to similar solutions in other libraries like libsodium. Right now it is a single C/C++ header + source file, which exposes a single C function, which is in turn foreign imported by an hsc file.

secure_memory_erase.h:

#pragma once

#include <stddef.h>
#include <stdint.h>

#ifdef __cplusplus
extern "C" {
#endif

void secure_memory_erase(void * mem, size_t bytes);

#ifdef __cplusplus
} // extern C
#endif

secure_memory_erase.cpp:

#include "secure_memory_erase.h"

#include <stdio.h>

#if defined(SME_TARGET_OS_HAS_RTLSECUREZEROMEMORY)

    // Windows: void RtlSecureZeroMemory(void * ptr, size_t n);
    // https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-rtlsecurezeromemory
    #define NOMINMAX 1
    #define _WINSOCKAPI_  // stop windows.h including winsock.h
    #define WIN32_LEAN_AND_MEAN // stop windows.h including lots of things
    #include <windows.h>
    #undef NOMINMAX
    #undef _WINSOCKAPI_
    #undef WIN32_LEAN_AND_MEAN

#elif defined(SME_TARGET_OS_HAS_EXPLICIT_BZERO)

    // Linux, FreeBSD, OpenBSD: void explicit_bzero(void *b, size_t len);
    // https://man.freebsd.org/cgi/man.cgi?query=explicit_bzero
    #include <string.h>

#elif defined(SME_TARGET_OS_HAS_EXPLICIT_MEMSET)

    // NetBSD: void * explicit_memset(void *b, int c, size_t len);
    // https://man.netbsd.org/NetBSD-8.0/explicit_memset.3
    // NOTE: C23 is memset_explicit, *NOT* NetBSD explicit_memset
    #include <string.h>

#elif defined(SME_USE_VOLATILE_MEMSET_FOR_ZERO)

    // std::memset is called through a volatile function pointer below
    #include <cstring>

#endif

namespace Secure::Memory::Erase {

void erase(void * ptr, size_t n) {

    // TODO: If C23, just call memset_explicit like libsodium
    // TODO: If C11 and supported, maybe call memset_s

#if defined(SME_TARGET_OS_HAS_RTLSECUREZEROMEMORY)

    // Windows
    ::RtlSecureZeroMemory(ptr, n);

#elif defined(SME_TARGET_OS_HAS_EXPLICIT_BZERO)

    // FreeBSD, OpenBSD, Linux
    ::explicit_bzero(ptr, n);

#elif defined(SME_TARGET_OS_HAS_EXPLICIT_MEMSET)

    // NetBSD
    (void)::explicit_memset(ptr, 0, n);

#elif defined(SME_USE_VOLATILE_MEMSET_FOR_ZERO) && (SME_USE_VOLATILE_MEMSET_FOR_ZERO == 1)

    // MacOS
    /*
    Call memset through a static volatile pointer, which the compiler
    should not elide.
    */
    static void* (*const volatile memset_ptr)(void*, int, size_t) = std::memset;
    (memset_ptr)(ptr, 0, n);

#else

    // Fallback: MacOS and any other platform without a better option
    /*
    Cast to a volatile pointer and set each byte in a loop, which the compiler
    should not elide.
    */
    volatile uint8_t* p = reinterpret_cast<volatile uint8_t*>(ptr);

    for(size_t i = 0; i != n; ++i) {
        p[i] = 0;
    }

#endif

} // erase

} // Secure::Memory::Erase

extern "C" {

void secure_memory_erase(void * ptr, size_t n) {

    Secure::Memory::Erase::erase(ptr,n);

}

} // extern C

Secure/Memory/Erase.hsc:

{-# LANGUAGE CApiFFI #-}

module Secure.Memory.Erase where

import Prelude

import Foreign.C.Types
import Foreign.Ptr

#include "secure_memory_erase.h"

foreign import capi safe "secure_memory_erase.h secure_memory_erase"
    secureMemoryErase
        :: Ptr a -- ^ mem
        -> CSize -- ^ bytes
        -> IO ()

Which solution actually gets used depends on cabal os flags - botan C++ uses a configure-script step to detect the OS on install, correlating it with an OS-dependent capabilities list that gets turned into a set of CPP #ifdef flags. We do the same, using cabal’s os conditionals instead:

if os(windows)
    cxx-options: -DSME_TARGET_OS_HAS_RTLSECUREZEROMEMORY
if os(linux) || os(freebsd) || os(openbsd)
    cxx-options: -DSME_TARGET_OS_HAS_EXPLICIT_BZERO
if os(netbsd)
    cxx-options: -DSME_TARGET_OS_HAS_EXPLICIT_MEMSET

It is not properly tested, however, and we could potentially support a wider / more fine-grained range of operating systems (botan C++ includes a rather extensive list).

I’d potentially like to ditch the C++ in favor of pure C, as well as add a flag to use botan C++ as a dependency instead of the included source, but so far it appears to be working.

Ideally, there should also be a Scrubbed or Secure typeclass that requires a secure erasure function, thus making the actual solution more of a choice; then we’d simply be providing a sensible default secureMemoryErase function without getting in the way of other solutions. It may be a little convoluted, but this would allow botan to depend on secure-memory-erase while still using its own functions internally.
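
A rough sketch of what that could look like, with hypothetical names:

{-# LANGUAGE RankNTypes #-}

import Foreign.C.Types (CSize)
import Foreign.Ptr (Ptr)
import Secure.Memory.Erase (secureMemoryErase)

-- The erasure strategy becomes a choice; secureMemoryErase is just the default
class Scrubbed a where
    scrubWith :: (forall p. Ptr p -> CSize -> IO ()) -> a -> IO ()

scrub :: Scrubbed a => a -> IO ()
scrub = scrubWith secureMemoryErase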

Between this and the bytearray-classes that I’ve been sussing out, I think I have separated out and established the critical functionality that should make an update to botan-low & botan actually meaningful, but also more broadly applicable.


Hey everyone, it’s good to be updating again. I’ve been away for health reasons - I’m doing rather well lately, but I very consciously have to take it easy in order to maintain that. I may be a bit quiet or slow to respond, but I’m here, and I’ll be taking my time so as to make sure I don’t deplete myself again.

11 Likes