Please consider this bit carefully and try to avoid these as much as possible, even for the final interface. It looks like there’s potential to overengineer here and make it hard for clients of the library to figure out how to use the API as these abstractions can significantly obscure it.
There is a long-term method to my madness; I am trying to keep the ergonomics of use in mind, but botan
is primarily a cryptographic kitchen sink, and so it is my duty to expose the full complexity of every algorithm and primitive. botan
is far closer to crypton
than something like saltine
.
However, I am planning for a user-friendly saltine
-like interface as well, and I think your concerns align more closely there. The end goal is to expose a pure, idiomatic interface, but all of the mutable IO and machinery underneath is still necessary, and this is part of that important machinery, and not necessarily the final interface.
A few things to help address your concerns:
-
I am being sure to use data families and not type families; this significantly reduces strain on the compiler, as data families do not inhibit type inference at usage sites like type families do, and instead places the onus of effort on me to fulfill the boilerplate.
-
This was always going to be complex, no matter how I chose to express it, and having tried several ‘simpler’ routes, it turns out that while data families do have more initial overhead for development, their complexity for use grows much more slowly. In comparison, the ADT approach is easy to develop, and terrible to use.
-
I am also writing a series of high-level tutorials focusing on explaining the correct usage, which in most cases consists of using a single type application or explicit type annotation. This interface affords a great deal of flexibility and specificity, as I have been quite thorough.
The following lines are all valid and equivalent:
sha3_512 "Fee fi fo fum!"
sha3 @512 "Fee fi fo fum!"
sha3 "Fee fi fo fum!" :: SHA3_512Digest
sha3 "Fee fi fo fum!" :: SHA3Digest 512
sha3 "Fee fi fo fum!" :: Digest (SHA3 512)
sha3 "Fee fi fo fum!" :: Digest SHA3_512
hash @(SHA3 512) "Fee fi fo fum!"
hash @SHA3_512 "Fee fi fo fum!"
hash "Fee fi fo fum!" :: SHA3_512Digest
hash "Fee fi fo fum!" :: SHA3Digest 512
hash "Fee fi fo fum!" :: Digest (SHA3 512)
hash "Fee fi fo fum!" :: Digest SHA3_512
-- 03a240a26674994a...
There is a great deal of freedom of choice, but from this we may expose a final, curated interface, eg by picking a particular idiom and emphasizing it in the tutorials.
- The cryptographic typeclasses are being designed to expose a consistent, implementation-agnostic cryptographic interface and will eventually be lifted out into their own package with botan continuing to conform to it. This will eventually allow alternate backends to fulfill the required primitives, while allowing more complex implementation-agnostic concepts like merkle trees to be implemented without locking yourself to one backend library. This will also allow the final curated interfaces to be reused across backends, and is part of the long-term goal of a unified haskell cryptography ecosystem.
Honestly, I still think the best abstraction here is backpack. At least at a conceptual level, it simply feels like the right things.
I’d be happy to have a chat about it, even if just to put it on your map.
In practice, however, it may be that backpack is not a viable option due to Stack’s lack of support and HLS — on the other hand, it could be a strong motivation to implement support for it
I do not at all oppose investigating the use of backpack to make swapping of implementation easier, but the ability to pull it out as an implementation-agnostic interface is merely a convenient side effect of being extremely stringent in acknowledging the unique properties of all of the various algorithms
I feel as though I need to clarify something for the general audience, however, as there appears to be some wariness or misunderstanding of why the typeclasses are being developed:
The typeclass abstractions are still necessary within botan
as a single package, in order to expose multiple algorithms for the same primitive in a type-safe manner without throwing exceptions, or pretending that algorithm variants don’t exist.
Many algorithms have unique knobs and levers, and it is only at the very end, when we process the plaintext, that they act similarly. Digest MD5
is not the same type as Digest (SHA3 512)
, and should not be considered equal, even if they were to contain the same bytes.
Some algorithm families are parameterized by a unique salt / identifier / tweak that is not necessarily bounded in size, and thus there may be an infinite number of instances to write manually. Skein512 512 "foo"
takes two additional parameters compared to MD5
- how are we to provide the additional arguments while providing a uniform interface?
Typeclasses were designed for this; the alternative is passing an algorithm witness around exactly like you might a typeclass implementation dictionary, while having to handle errors when people pass in a parameter that isn’t valid for a given algorithm / operation, which would be normally be handled by the type system via typeclass constraints.
I’d be writing the newtypes anyway for per-algorithm type-safety - take a look! The data families are effectively free and the typeclasses just codify everything that I’m doing anyway.
We have our data family and typeclass - not scary at all :
data family Digest hash
class (Eq (Digest hash), Ord (Digest hash)) => Hash hash where
hash :: ByteString -> Digest hash
We could implement simple concrete types for our algorithm this way, without the typeclass and data families:
newtype MD5Digest = MkMD5Digest { getMD5ByteString :: ByteString }
deriving newtype (Eq, Ord)
md5 :: ByteString -> MD5Digest
md5 = MkMD5Digest . Botan.hash Botan.md5
However, conformance to the typeclasses is effectively free
-- Free
data MD5
-- Free by replacing "MD5Digest" with "instance Digest MD5"
newtype instance Digest MD5 = MkMD5Digest { getMD5ByteString :: ByteString }
deriving newtype (Eq, Ord)
-- Unchanged
md5 :: ByteString -> MD5Digest
md5 = MkMD5Digest . Botan.hash Botan.md5
-- Free
type MD5Digest = Digest MD5
-- Free
instance Hash MD5 where
hash :: ByteString -> Digest MD5
hash = md5
Something with a few type parameters isn’t much harder.
data SHA3 (n :: Nat)
-- Pardon this atrocity
type SHA3Size (n :: Nat) = (KnownNat n, (n == 224 || n == 256 || n == 384 || n == 512) ~ True)
newtype instance Digest (SHA3 n) = MkSHA3Digest { getSHA3ByteString :: ByteString }
deriving newtype (Eq, Ord)
instance (SHA3Size n) => Hash (SHA3 n) where
hash :: ByteString -> Digest (SHA3 n)
hash = sha3
-- The only actual work involves type literals
sha3 :: (SHA3Size n) => ByteString -> Digest (SHA3 n)
sha3 = MkSHA3Digest . Botan.hash h where
n = fromInteger $ natVal $ Proxy @n
h = fromJust $ Botan.sha3
-- We still export user-friendly functions at the end
sha3_512 :: ByteString -> Digest (SHA3 512)
sha3_512 = sha3 @512
I want to stress that these typeclasses aren’t replacing anything, they’re only augmenting. I hope this gives a better idea as to the reasoning behind my approach.
Yes yes yes! That does clarify things, thank you.
That example is quite good. Looking forward to seeing more of this work
I think instead of the atrocity, the following might be more helpful:
type family IsSHA3Size n where
IsSHA3Size 224 = ()
IsSHA3Size 256 = ()
IsSHA3Size 384 = ()
IsSHA3Size 512 = ()
IsSHA3Size _ = TypeError "Make this a custom type error that indicates what the allowed numbers are"
type SHA3Size (n :: Nat) = (KnownNat n, IsSHA3Size n)
This way the error message will be a lot more helpful.
Update: botan 0.0.1.0 package candidate uploaded!
As promised, it’s here:
There’s still a good bit of work to be done before release, but I’ll be giving it the same level of love that botan-bindings
and botan-low
got while they were a package candidate - which was a lot.
First up, I’ve flattened the algorithm types for most operations (see issue #1 from earlier) with an overall positive result.
It is really visible in something like Botan.Cipher where we both have specific subsets that we need to deal with (Cipher vs AEAD), while also taking specific subsets of other operations as arguments (BlockCipher vs BlockCipher128). This new method is proving to be significantly better to use, so I’ll be finishing up and applying to the remaining operation / algorithm data types as necessary.
Secondly, I’ve decided to not elide the cryptography typeclasses from the 0.0.1 release (see issue #2 from earlier), but will still be postponing their continued development until after the 0.0.1 release, as to stay focused on the immediate needs - priortization!
The repo has been updated
Back in Action after getting some Rest
After getting botan-bindings
and botan-low
to the initial 0.0.1
release and botan
to package candidate status, I found myself in need of a bit of a break - both because I needed to pull back and get a bird’s eye view of it, and because I’ve been working on this project almost non-stop for 8 months now and everyone deserves a bit of vacation now and then.
Now that’s over, and it is time to get back in action! As this project transitions to the next phase of its life cycle, I’ve been writing an official Haskell Cryptography Group milestone blog post and call-for-users, taking into account feedback on the project and the round 2 funding proposal (which I am also still working on ). I’ll be discussing the history of this project and how it all started, what we’ve accomplished so far, what feedback we’ve heard / users want, and what flag we’ve planted on the horizon as our goal, as well as concrete details on the specific things we wish to accomplish next.
You can expect regular updates to resume Monday next week!
Minor note: The project has now officially transferred ownership to the Haskell Cryptography Group, and as planned, the repository been moved:
A more sizeable update and proper First Milestone blog is coming soon In the meantime, please enjoy this sneak peek at something I had fun implementing while on the break!
Just updating some old code to use botan-low
…
Returning from Absence.
I underestimated how much rest I needed, but in the words of Granny Weatherwax “I ate’nt dead”.
This last month has been half taking-a-break, half writing out the milestone blog and plans for the future, and half trying to figure out how to enable that future. For the moment, things are out of my hands, so instead, I’m going to do what I do best, which is the code, and update the devlog.
Things that have changed in the last few weeks:
- Cryptographic typeclasses have been temporarily moved to its own branch to focus on releasing
botan
- Minor improvements to utility functions and error handling
- Botan.RNG significantly reworked and cleaned up
- Botan.KDF, PwdHash, HOTP, TOTP are now release-ready and have tutorials and explanations
- Stable implementation errors in
botan_pwdhash_timed
are now handled properly - see […] confusing param mapping between pwdhash and pwdhash_timed - A minor update to
botan-bindings
andbotan-low
may be necessary to improve constants and organization of named elliptic curves, logarithm groups, and raw algorithm modes.
This isn’t pushed yet - I’ll let you known when it is.
Hey @ApothecaLabs, is there a good place to chat with you/the botan devs about development? I’m currently working through seeing how feasible it would be to swap crypton for botan in amazonka, and I’m running into a lot of issues with the design of the hashing API. I was wondering if there’s a good IRC channel or somewhere to chat - here feels like the wrong place but I can do that if needed.
Quick summary of the issues:
- There’s no pure, incremental hashing API (you probably say my GitHub issue), which makes writing things like pure streaming sinks difficult -
Monad m => Conduit ByteString () m (Digest a)
can’t currently be written, it needsMonadIO
instead (and with the current assumption that all hash state will be mutable updated, there could be bugs in how something that would get used, depending onm
).
As an aside here, linear types would make this API both pure and quite nice to work with - it probably shouldn’t be the only interface, but it would avoid the required copying of the context that a pure API would necessitate - The
Digest
data family is difficult to work with generically, it’s nearly impossible to write something that works on the wrappedByteString
s in functions with types likeHash a => Digest a -> ByteString
. I’d love to see a functiondigestBytes :: Digest a -> ByteString
added to theHash
class, at the very least. - I’ve also grown quite fond of
memory
'sByteArrayAccess
class and its ability to abstract types like digests soo they can easily be hashed too. I think that cryptonite got a lot write with its use ofmemory
, forcing everything to be aByteString
limits things a lot for production code.
I’m sure I’ll run into more, but this is where I am at the moment. I’m aware that botan
is unreleased, but wanted to at least investigate whether it’ll be usable in Amazonka. At the moment there isn’t a clear path that wouldn’t affect performance.
I should also mention that I’m happy to contribute the development of the changes I’d like to see - I should make a fork start playing around with ideas.
Since we’re on the topic of Amazonka, I should also say that I’d love to see botan
become the cryptography library that it rests on, but there are a few things beyond @Axman6’s exploration that I’d also like to see happen:
- The changes to support FFI really should make it into
botan
upstream, so that we’re depending on the same vetted code as everyone else. - The Haskell packages and the native library should have a good build/download story for all major platforms. I don’t want to accidentally cut off users on e.g., Windows if there’s no easy way to get them the native deps they need. (Also, AFAICT, Botan 3 is only in Debian experimental.)
-
botan-bindings
should not have to vendor any source code. I’m leery of packages which vendor source code, especially in the cryptography and security spaces.
These sound like pretty big asks to me, so I want to get them on the table now as a starting point for discussion. Maybe I’m being too dogmatic, or maybe they’re easier than I think, but let’s talk about them at least.
As already explained elsewhere: [Haskell Cryptography Group] Botan: The First Milestone - #14 by hasufell
3.3.0 is in debian experimental and oddly enough, windows is likely the least problematic platform.
I tried to create a botan-clib package, but the build system is a disaster: which source files to compile depends on platform and processor features that are detected during “configure”. You’ll have to map those things manually in the cabal file and create flags for all the cpu features, because the c source files used need to be encoded in the cabal file. And then review the build system changes on every version bump.
These are things that e.g. a gentoo or nixpkgs developer would be comfortable to do, but it isn’t at all trivial. And you lose the automagic detection of features.
Another way might be to abuse custom setup really hard, but that’s not gonna be very popular or easy either.
@Axman6 I am down for a video chat to discuss your needs in depth - just send me a private message and we’ll pick a time.
I have a lot to say in response to all this but saying it all would take time that I need to spend getting botan
updated / released. So you get fragmented notes in the name of expediency I hope they are not too insensible.
-
It is brave of you to be testing out the high-level interfaces when they are unstable
- Super glad you are interested
- I am trying to keep this in mind, but when I push the release (cross your fingers that its within a day or two) things may change (for the better I hope!)
- The focus has been on getting everything working in at least one way - then, we can deal with different interfaces
-
Data families vs witness / proxy
- I don’t mind providing more than one interface, I just happen to like data families
- Am planning on interfaces that also take a
alg
value witness orProxy alg
as an alternative, likecrypton/ite
- Am planning on interfaces that also take a
- Most other cryptography operations have multiple types that must work together so data families really works for them
- Witnesses are simple when you only have a single operation to apply them to
- Witnesses get hard when you have to apply them to every function in a multi-step operation
- Or when you need multiple or compound algorithms
- Your specific use case is the classic weak spot for data family type inference, eg when:
- Inferring from the return type as the only mention of the actual type as you cited
- Inferring a type that is not visible to the outer scope (eg,
digestBytes $ hash "foo"
)
- Hashing only has one related type, which is also produced as an output
- This means hashing in particular doesn’t work as well with data families, but it does work well with witnesses
- Being consistent overall is worth the tradeoff though
- We can still expose a witness / proxy -based interface too so this pain is temporary
- I don’t mind providing more than one interface, I just happen to like data families
-
There is a difference between incremental / online and mutable
- I’m not surprised you are having issues because I’ve been wanting to improve this are in particular - its confusing and so you should expect a lot of change here in the coming update(s)
-
Incremental
is really about online ie chunked processing, I want to reduce it to Strict vs Lazy ByteStrings so a typeclass may then be unnecessary -
Mutable
is really about stateful contexts- A
crypton/ite
-style mutable interface is planned
- A
-
Pure interfaces - only Hash has copyState, so specifically for hashes it is possible to make a stepwise pure interface out of a mutable interface, but for other operations it is not
- This is actually sensible if you stop to think about it
- Most cryptographic operations, copying state is highly undesirable or an outright information leak, so they don’t
- hashing is a rare exception as you may be hashing things sharing the same prefix, so there is a major efficiency gain
- The same is not necessarily true for other operations eg encryption and signing, especially anything using nonces
- The botan hash state is mutable, so I am highly averse to pretending that it is pure
- Need to guarantee that the IO actions are only run once each, and that they are run in the right order
- Can make a copy of the hash state for every ‘pure’ function, but this is inefficient - but it is possible
- Can pretend that its pure if we pretend that it is linear (ie, the result is only used once as an argument for the next action)
- Trying to avoid a class-splosion
- A similar issue regarding pure vs mutable interfaces can be seen in
random
withRandomGen
vsStatefulGen
- This is actually sensible if you stop to think about it
-
Regarding
memory
, I too liked theByteArrayAccess
class and friends, however there are issues that make me hesitant to include it as a dependency- The source repository has been archived by the author and is read-only
- It does not appear to be maintained (and so may warrant initiating a takeover process)
-
botan-low
has an absolutely minimal dependency footprint (aside from the botan library itself) - However, I have already been considering that it may be worth reviving those classes due to their utility
I hope that is readable and relevant. It will make more sense when I publish the code.
[
memory
] does not appear to be maintained (and so may warrant initiating a takeover process)
It is confirmed that none of Vincent Hanquez’ packages will be maintained, nor has he given an ok for taking it over. He’s ok with anyone forking it and using a new name, so that’s what’s been happening with crypton(ite)
and (crypton-)x509
etc.
Xeno’s Paradox Progress
I’ll keep this short; progress toward an update / release is ongoing; I feel as though every update we get about half the remaining work done, but discover more to do. However, an update is good, and I’m trying to not stress about it It has gotten a bit slow because now I might spend the entire day chasing down one specific operation on one specific algorithm in order to verify that it is well-behaved. The closer you look, the more detail there is to be seen.
Regardless, progress:
- All the
botan
modules are now 100% functional-complete - Significant reorganization of modules to reduce their number
- Lots of improvements to PubKey, including post-quantum Kyber / Dilithium algorithm parameters
- There are proper newtypes for everything, and so type safety is significantly improved
- Basic unit tests for
botan
are in progress, and are the major blocker for release now - Some massaging / improvements are still needed per catching exceptions and turning them into Either / Maybe when appropriate
- This bears pondering, as there are some failures where an exception is far more appropriate
- An example:
encrypt
is a general function for multiple algorithms; some algorithms have specific handling requirements for plaintext length and chunking, and failure to validate may throw an exception if the function is provided inappropriate arguments - this failure is different from failing to decrypt because of using the wrong key or auth and should return Nothing rather than throwing an exception. - This is important because otherwise we have to make a function like
encrypt
return aMaybe Ciphertext
even though encryption really shouldn’t fail the same waydecrypt
can. This would result in an un-ergonomic interface. - I am strongly leaning towards providing validation functions for input, throwing exceptions in the standard functions (and maybe an Either variant that doesn’t), and saving
Maybe
for functions likedecrypt
that are expected to fail in specific ways.
- All of the typeclasses / type family / algorithm-specific modules are still separated out for the moment, will need some minor merging back in, and may not make the initial
botan
release (as to better focus on the immediate needs)
I’m trying to get this out sooner rather than later, but it will require a minor update to the lower libraries, and I’ll need to make sure I don’t goof that up - I’m thinking I’ll bump botan-bindings
and botan-low
from 0.0.1
to 0.1.0
because I think there a few minor breaking changes, and release the first version of botan
as 0.1.0
.
What to do about memory
re: crypton/ite
?
I’ve been absent for some time due to the need to focus on personal health. It can be a struggle to reach out at times like these, but this last month I’ve not been idle. Herein lie the results of my investigation in the meantime.
In order for botan
to provide a palatable alternative to crypton/ite
, it is important to understand how people are using it, and what dependencies it implies. One of those dependencies is memory, which supplies a variety of typeclasses and data types regarding low-level pointers and memory. This package, like cryptonite
and by the same author, has been archived, and will not be updated in the future.
This is problematic, because even after forking crypton/ite
, many libraries still have a transitive dependency on memory
. The reason for this is that one of the prominent design choices was to use constraints instead of concrete data types in many of the typeclasses and functions.
For example:
class Cipher cipher where
...
cipherInit :: ByteArray key => key -> CryptoFailable cipher
...
class Cipher cipher => BlockCipher cipher where
...
ecbEncrypt :: ByteArray ba => cipher -> ba -> ba
ecbDecrypt :: ByteArray ba => cipher -> ba -> ba
...
Thankfully, most algorithm-specific outputs are wrapped in a concrete newtype.
hash :: (ByteArrayAccess ba, HashAlgorithm a) => ba -> Digest a
That does seem to be the general rule, that unstructured blobs of binary data be represented a ByteArray
, mostly plaintext
/ ciphertext
/ key
/ salt
/ password
-type data, but it can be slightly onerous when parameters are allowed to be heterogenous
bcrypt
:: (ByteArray salt, ByteArray password, ByteArray output)
=> Int -> salt -> password -> output
As a result, unless one is dealing exclusively with ByteString
, and / or the data types supplied by crypton/ite
itself, use of memory
is almost compulsory. Furthermore, if we wish to provide a similar interface using botan
as a backend, this means either accepting memory
as a dependency, or doing something about it.
As such, I have spent a lot of time familiarizing myself with its guts and how it is used. I really wanted to know the intent behind many of the decisions the author made (such as how the different allocation / create functions need to be used re: inlining and unsafePerformIO) because it doesn’t do to simply copy and paste. Here’s what I have learned:
The following libraries use both crypton/ite
and memory
, but there are many more.
amazonka-core
stack
tls
password
x509
One can look at the reverse dependencies for crypton/ite
and memory
to see the general cross-section. Thankfully most dependents of crypton/ite
found ByteString
and the supplied data types sufficient, and so they did not promulgate the issue by depending on memory
directly, only indirectly.
So, what is memory
?
It is a small, innocuous package, mostly exposing the following classes that allow ByteString
and primitive:Data.Primitive.ByteArray
-like access to pointers to contiguous memory:
-- Data.ByteArray
class ByteArrayAccess ba where
length :: ba -> Int
withByteArray :: ba -> (Ptr p -> IO a) -> IO a
copyByteArrayToPtr :: ba -> Ptr p -> IO ()
class (Eq ba, Ord ba, Monoid ba, ByteArrayAccess ba) => ByteArray ba where
allocRet :: Int -> (Ptr p -> IO a) -> IO (a, ba)
data Bytes
data ScrubbedBytes
-- Data.ByteArray.Sized
class (ByteArrayAccess c, KnownNat n) => ByteArrayN (n :: Nat) c | c -> n where
allocRet :: forall p a. Proxy n -> (Ptr p -> IO a) -> IO (a, c)
data SizedByteArray (n :: Nat) ba
These classes allow for fast and efficient manipulation of memory, and basically wrap up and classify a lot of low-level functions that GHC uses internally - incredibly powerful, and about as far as you can go without dropping down to using primitive
and MagicHash
directly.
Notably, the package does not distinguish between pinned and unpinned memory, though the unspoken assumption is that the memory pointed to in withByteArray :: ba -> (Ptr p -> IO a) -> IO a
will not be moved for the duration of that operation.
Aside from a few minor quibbles of hierarchy and nomenclature for purposes of consistency with Data.Primitive.ByteArray
and Foreign.*
and for better separation of bytewise functions vs elementwise functions, it is a useful set of classes. I see no reason to make large changes to it.
So, how is memory
being used
I do wish to understand how memory
is being used, and so I have gone over many of the direct dependents in order to collate their use cases.
A few particular use cases dominate:
-
ByteArrayAccess
andByteArray
classes -
convert
andconstEq
functions -
convertToBase
functions
Aside from that, there are several errata in the package that we must also consider. Notable:
ScrubbedBytes
-
SizedByteArray
andByteArrayN
ByteArrayAccess
and ByteArray
By and far, the most-oft used parts of this package are the classes themselves. The major user of these classes is the crypton/ite
ecology, and are the core of the package. Enough said - high priority.
convert
and constEq
If they aren’t using ByteArray
as a general ByteString
-like class, directly, chances are they using memory
in order to convert
between ByteArray
-conforming types. This is actually quite sensible, as many cryptography-related packages have some modules / classes / functions for converting various primitives / newtypes to and from bytestrings.
The other common use is to pull in constEq
for constant-time equality checks for sensitive information.
Quite often, the memory
package is pulled in for just for one or both of these. Between that and the classes themselves, we’ve covered the majority of memory
use cases
Base
and convertToBase
Seeing occasional use, memory
also exposes functionality for converting between various bases in the Data.ByteArray.Encoding
module. This is important, but not critical, because there are already packages for converting bytestrings to various bases.
SizedByteArray
and ByteArrayN
Seeing infrequent use (eg, NaCl
). Essentially a Nat
-parameterized ByteArray
, it might have been called FiniteByteArray
if being consistent with Bits
and FiniteBits
.
ScrubbedBytes
Seeing infrequent use (eg, NaCl
).
memory
supplies a ScrubbedBytes
class for sensitive memory that protects its Show
instance and is automatically zeroed before release. This is a really important but also niche use, and I only observed one or two packages using this functionality.
Here is how the zeroing of memory works for ScrubbedBytes
:
getScrubber :: Addr# -> State# RealWorld -> State# RealWorld
getScrubber addr s =
let IO scrubBytes = memSet (Ptr addr) 0 (I# sz)
in case scrubBytes s of
(# s', _ #) -> s'
Essentially, it is using extremely low-level RealWorld
magic to dynamically vend a memSet
function call in the finalizer of ScrubbedBytes
, in an effort to avoid the compiler eliding the zeroing due to the following free (that is, many assemblers will ignore memSet
if it is immediately followed by free
, because such a memSet
is normally considered to be a no-op).
Even if a compiler emits the instructions to zero memory in the compiled binary, that is no guarantee that it will be run, because the CPU may still optimize it out during runtime. As a result, most operating systems provide a specific OS-dependent function for this in order to ensure that the memory is zeroed, and the volatile
keyword and memset_explicit
function have only recently been properly standardized in C.
I cannot attest to the efficacy of this functionality regarding GHC specifically, but for reference, here is how Botan::secure_scrub_memory
works in C++
void secure_scrub_memory(void* ptr, size_t n) {
#if defined(BOTAN_TARGET_OS_HAS_RTLSECUREZEROMEMORY)
::RtlSecureZeroMemory(ptr, n);
#elif defined(BOTAN_TARGET_OS_HAS_EXPLICIT_BZERO)
::explicit_bzero(ptr, n);
#elif defined(BOTAN_TARGET_OS_HAS_EXPLICIT_MEMSET)
(void)::explicit_memset(ptr, 0, n);
#elif defined(BOTAN_USE_VOLATILE_MEMSET_FOR_ZERO) && (BOTAN_USE_VOLATILE_MEMSET_FOR_ZERO == 1)
static void* (*const volatile memset_ptr)(void*, int, size_t) = std::memset;
(memset_ptr)(ptr, 0, n);
#else
volatile uint8_t* p = reinterpret_cast<volatile uint8_t*>(ptr);
for(size_t i = 0; i != n; ++i)
p[i] = 0;
#endif
}
Replacement of this functionality is considered important, but as a low-level priority compared to the higher-priority functionalities.
Other errara
Outside of that, there are portions of memory
that see little to no actual use:
- Parser and Packer modules
- Endianness
- Word128
- The SipHash and FNVHash modules
I do not think we need to worry about these for the time being.
So, what do we do about it?
I think it would be wise to provide ByteArray
-like functionality in botan
for better compatibility with crypton/ite
and easier migration, and this assessment helps illustrate what capability needs to be preserved, and what can be ignored.
Thus I am considering creating a slimmed down bytearray-classes
package, that plucks out the necessary classes and functions in order to allow us to depend on it instead of memory
.
We definitely want the bulk of these modules, maybe ignoring the concrete data types and focusing only on the classes:
- Data.ByteArray
- Data.ByteArray.Encoding
- Data.ByteArray.Sized
- We also probably do want to make some data types a la
Bytes
andScrubbedBytes
Using bytearray-classes
Migration would ideally be simple, as I don’t intend on changing the interfaces much, but this actually leads to a rather thorny problem and important decision - where in botan
does this library get used? The answer to that is slightly frustating, and one that gives me no small amount of anxiety:
For improving compatibility / ease of migration, the best place to use bytearray-classes
and ByteArray
constraints like crypton/ite
is in botan-low
.
The reasoning here is that the concrete ByteString
functions can be obtained from the ByteArray
functions simply by restricting the types, and we would be able to make all of the internal logic and function generators for allocating ByteString
s much more consistent.
This of course depends on whether people actually prefer a ByteArray
constraint over a concrete ByteString
parameter at all, as you may find use of TypeApplications
suddenly necessary.
Regarding amazonka-core
: we only seem to directly use memory
for base{16,64}
encoding and it might be easy to remove. Issue
Ah, it is not, but I suspect this is a major cause of direct dependencies on memory
:
This is not straightforward: the various
Digest a
s that we get fromcrypton
can only be interacted with using theByteArrayAccess
interface frommemory
, which is whyamazonka-core
has polymorphic functions forbase{16,64}
encoding. And if we move tobotan
we’ll be able to get encoders from there anyway, so there’s little point in doing this refactoring twice.
The password
library also doesn’t really need (or at least want) to use memory
.
The convert
to Bytes
we have is only in there because benchmarks showed it is slightly more performant and if we’re depending on memory
already (because of crypton(ite)
), then why not.
The one we DO want/need is the constEq
for security reasons. Though I feel that is something that could also just be added to the bytestring
library. Then we’d use that one.
What to do about memory
, part 2: Secure memory erasure
Aside from the ByteArray
classes, there is another feature that I wish to extract from memory
, and that is the ScrubbedBytes
, or rather, the functionality that backs it.
What is ScrubbedBytes
?
I got into it a bit last post, briefly covering ScrubbedBytes
as one of the occasioanlly used features from memory
.
To quote the documentation:
ScrubbedBytes is a memory chunk which have the properties of:
- Being scrubbed after its goes out of scope.
- A Show instance that doesn't actually show any content
- A Eq instance that is constant time
Data types like these are used to avoid accidentally leaking sensitive data and / or personally identifiable information through means such as logs and core dump, and any production team worth their salt has a process set up to handle this (usually accompanied by / piped straight into analytics).
Breaking it down
- The memory is scrubbed after it goes out of scope.
When all references to a ScrubbedBytes
instance are gone, the finalizer runs, and the data will be explicitly zeroed out. This is great! Kind of…
But what does it actually do? Well, if we peek into the source code of newScrubbedBytes
, we see that ScrubbedBytes
is just an aligned, pinned byte array with a finalizer. The finalizer contains a dynamically-generated scrubber function, which eventually calls memset
on the memory region.
ScrubbedBytes.hs
:
newScrubbedBytes :: Int -> IO ScrubbedBytes
newScrubbedBytes (I# sz)
-- Elided zero-size cases
| otherwise = IO $ \s ->
case newAlignedPinnedByteArray# sz 8# s of
(# s1, mbarr #) ->
let !scrubber = getScrubber (byteArrayContents# (unsafeCoerce# mbarr))
!mba = ScrubbedBytes mbarr
in case mkWeak# mbarr () (finalize scrubber mba) s1 of
(# s2, _ #) -> (# s2, mba #)
where
getScrubber :: Addr# -> State# RealWorld -> State# RealWorld
getScrubber addr s =
let IO scrubBytes = memSet (Ptr addr) 0 (I# sz)
in case scrubBytes s of
(# s', _ #) -> s'
finalize :: (State# RealWorld -> State# RealWorld) -> ScrubbedBytes -> State# RealWorld -> (# State# RealWorld, () #)
finalize scrubber mba@(ScrubbedBytes _) = \s1 ->
case scrubber s1 of
s2 -> case touch# mba s2 of
s3 -> (# s3, () #)
PtrMethods.hs
:
memSet :: Ptr Word8 -> Word8 -> Int -> IO ()
memSet start v n = c_memset start v (fromIntegral n) >>= \_ -> return ()
foreign import ccall unsafe "memset"
c_memset :: Ptr Word8 -> Word8 -> CSize -> IO ()
This attempts to avoid leaking sensitive data in core dumps and memory (I believe it is attempting to use the crossing of FFI / module boundaries in between the memset and free to ensure it is not elided), and is functionality that needs to be preserved, but it’s just calling memset
from the C standard library, and it isn’t really exposed in a manner that I am comfortable with, either. In the absence of another mechanism, relying on GHC triggering GC to clean up sensitive memory is just begging for trouble.
Also, this is all done in an attempt to prevent the compiler from eliding the memset
because it occurs immediately before a free
. Many compilers will automatically remove any writes to a memory region that is about to be freed, which has been a significant thorn in the side of computer security, and has caused countless leaks - there are better options than memset
, more on that later.
I would prefer a bracketed withScrubbedBytes
-style function over relying on a data type with finalizers, but it is really the secure / guaranteed erasure capability that we are after, and using it in a finalizer is better than nothing.
Notably, this is something that cannot be handled in Haskell - it requires dropping down to C
.
- A Show instance that doesn’t actually show any content
When we print a value, we don’t want to show sensitive data or PII, so instead we sanitize it. It doesn’t stop you from directly accessing the raw contents and intentionally printing them, but it can save you from accidental exposure, eg if a debug statement makes it into production
This isn’t really specific to any data type or function or even language. Any language with a Show
or Printable
can override their debug / print / log instances to avoid any sensitive data from being exposed. Thus, preserving this functionality doesn’t require any special effort.
Perhaps instead of requiring the logic in Show
, it could be codified in a typeclass SecureShow
that provides a printing formatter that automatically elides secure sub-values via Generics or something. However, that requires then that there be a secure logging mechanism that utilizes it, and is a bit far out for the moment.
- A Eq instance that is constant time
When we compare sensitive data for equality, the goal is to report whether or not they are equal, and nothing more.
This requires special handling, because if you short-circuit the comparison, how long it takes to compare them may leak information - eg, if we stop comparing at the first byte because they are different, you know something about where in the data the difference occurs, and thus the implementation leaks more information than just ‘is equal or not’.
This is not especially problematic, as we simply need ensure that the entire data is compared before returning the result.
Additionally, constEq
is already part of the ByteArray
classes, and can be preserved from there.
Replacing memset
Now, with our properties in mind, we can really focus on the big gotcha, which is replacing use of memset
. But why? First, some history:
In the beginning, the Universe C standard was created without any concern for data security. This has made a lot of programs very insecure and been widely regarded as a bad move.
Compilers back then were not nearly as smart, and CPU design was much simpler; when you issued a memset
call, you could expect that that memory region would be set. Then compilers got smarter, and started optimizing. One of those optimizations is to discard any writes to memory that is about to be freed, and another optimization is to leave the content of malloc
uninitialized. You may see where this is going - all of a sudden, data that had been cleared out using memset
was reappearing in newly malloc
ed data. This is not a problem if calloc
is used, but it is slower, and this issue retroactively affected older code, and we’ve been paying for it ever since.
There have since been developed three general different ways of handling secure memory erasure:
- Adding a secure memset variant to the C standard, which is guaranteed not to be elided
- memset (C89)
- memset_s (C11)
- memset_explicit (C23)
memset
is the original function we are trying to replace. C11 added the memset_s
function for secure erasure.
memset_s()
can not be elided: K.3.7.4.1: Unlikememset
, any call to thememset_s
function shall be evaluated strictly according to the rules of the abstract machine as described in (5.1.2.3). That is, any call to thememset_s
function shall assume that the memory indicated bys
andn
may be accessible in the future and thus must contain the values indicated byc
.
Unfortunately, it was both improperly specified, and added as a part of Annex K, which was not a required part of the standard, and thus not widely implemented.
As a replacement for memset_s
, C23 specifies the function memset_explicit
, which should be preferred in new code. However, C23 is not yet finalized, and thus cannot be assumed to be available yet.
Thus, we should prefer to call memset_explicit
if at all possible. I’m not sure whether we should even try to call memset_s
, more investigation is needed.
- Calling an OS-specific secure memset variant
- explicit_memset (NetBSD, Oracle)
- RtlSecureZeroMemory (Windows)
- explicit_bzero (FreeBSD, OpenBSD, Linux)
- memzero_explicit (Linux)
Our secondary preference is to call an OS-specific function explicitly intended for securely erasing memory. There are a plethora of these, and we can select which one at compile-time by detecting the operating system.
We should prefer to call these if and only if memset_explicit
is not available.
- Using volatile function or memory pointers to avoid being optimized out
- calling memset on the memory region through a volatile function pointer
- writing to the memory region through a volatile pointer
- writing an extern value to the memory region
We should fall back on these if and only if no other solution is available. Notably, MacOS / clang
uses this fallback in botan
, with an accompanying note.
- Increasingly wild alternative methods
- using
mmap + MAP_ANONYMOUS + MAP_PRIVATE
- using a function from a widely-used library (eg OpenSSL, libsodium, botan) which does 1-3 for you
These can’t be relied upon, for various reasons. Also, mmap
is POSIX and MAP_ANONYMOUS
was just added in POSIX 8.
Doing it properly vs doing what we can
Doing it properly requires deeper support than just C- or ASM-level. Modern CPUs are incredibly complex, with multiple levels of cache to handle, and in some ways, the problem has just moved inside of the CPUs, where the internal microarchitecture may still JIT discard / optimize instructions out of the machine code unless the proper instructions are used. The real solution is for the OS to keep keys / sensitive data / PII inside of a secure enclave, but that’s OS & hardware level, and “Dont ever load it in local memory” is not a solution that is always available.
We can’t handle that here, so the methods described are what we can do at the language level - we do the best to provide a solution for when we DO have sensitive information in local memory.
secure-memory-erase
To that end, I’ve plucked out and stripped down the secure-memory-erase
code from botan C++
, and have been looking at / comparing it to similar solutions in other libraries like libsodium
. Right now it is a single C/C++
header + source file, which exposes a single C function, which is in turn foreign imported by an hsc
file.
secure_memory_erase.h
:
#pragma once
#include <stddef.h>
#include <stdint.h>
#ifdef __cplusplus
extern "C" {
#endif
void secure_memory_erase(void * mem, size_t bytes);
#ifdef __cplusplus
} // extern C
#endif
secure_memory_erase.cpp
:
#include "secure_memory_erase.h"
#include <stdio.h>
#if defined(SME_TARGET_OS_HAS_RTLSECUREZEROMEMORY)
// Windows: void RtlSecureZeroMemory(void * ptr, size_t n);
// https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-rtlsecurezeromemory
#define NOMINMAX 1
#define _WINSOCKAPI_ // stop windows.h including winsock.h
#define WIN32_LEAN_AND_MEAN // stop windows.h including lots of things
#include <windows.h>
#undef NOMINMAX
#undef _WINSOCKAPI_
#undef WIN32_LEAN_AND_MEAN
#elif defined(SME_TARGET_OS_HAS_EXPLICIT_BZERO)
// Linux, FreeBSD, OpenBSD: void explicit_bzero(void *b, size_t len);
// https://man.freebsd.org/cgi/man.cgi?query=explicit_bzero
#include <string.h>
#elif defined(SME_TARGET_OS_HAS_EXPLICIT_MEMSET)
// NetBSD: void * explicit_memset(void *b, int c, size_t len);
// https://man.netbsd.org/NetBSD-8.0/explicit_memset.3
// NOTE: C23 is memset_explicit, *NOT* NetBSD explicit_memset
#include <string.h>
#endif
namespace Secure::Memory::Erase {
void erase(void * ptr, size_t n) {
// TODO: If C23, just call memset_explicit like libsodium
// TODO: If C11 and supported, maybe call memset_s
#if defined(SME_TARGET_OS_HAS_RTLSECUREZEROMEMORY)
// Windows
::RtlSecureZeroMemory(ptr, n);
#elif defined(SME_TARGET_OS_HAS_EXPLICIT_BZERO)
// FreeBSD, OpenBSD, Linux
::explicit_bzero(ptr, n);
#elif defined(SME_TARGET_OS_HAS_EXPLICIT_MEMSET)
// NetBSD
(void)::explicit_memset(ptr, 0, n);
#elif defined(SME_USE_VOLATILE_MEMSET_FOR_ZERO) && (SME_USE_VOLATILE_MEMSET_FOR_ZERO == 1)
// MacOS
/*
Call memset through a static volatile pointer, which the compiler
should not elide.
*/
static void* (*const volatile memset_ptr)(void*, int, size_t) = std::memset;
(memset_ptr)(ptr, 0, n);
#else
// MacOS
/*
Cast to a volatile pointer and set each byte in a loop, which the compiler
should not elide.
*/
volatile uint8_t* p = reinterpret_cast<volatile uint8_t*>(ptr);
for(size_t i = 0; i != n; ++i) {
p[i] = 0;
}
#endif
} // erase
} // Secure::Memory::Erase
extern "C" {
void secure_memory_erase(void * ptr, size_t n) {
Secure::Memory::Erase::erase(ptr,n);
}
} // extern C
Secure/Memory/Erase.hsc
:
{-# LANGUAGE CApiFFI #-}
module Secure.Memory.Erase where
import Prelude
import Foreign.C.Types
import Foreign.Ptr
#include "secure_memory_erase.h"
foreign import capi safe "secure_memory_erase.h secure_memory_erase"
secureMemoryErase
:: Ptr a -- ^ mem
-> CSize -- ^ bytes
-> IO ()
Which solution actually gets used depends on the cabal os
flags - botan C++
uses a configure
script step to detect the OS on install, correlating it with an OS-dependent capabilities
list that gets turned into a set of CPP #ifdef
flags. We do the same, using the cabal os
instead:
if os(windows)
cxx-options: -DSME_TARGET_OS_HAS_RTLSECUREZEROMEMORY
if os(linux) || os(freebsd) || os(openbsd)
cxx-options: -DSME_TARGET_OS_HAS_EXPLICIT_BZERO
if os(netbsd)
cxx-options: -DSME_TARGET_OS_HAS_EXPLICIT_MEMSET
It is not properly tested, however, and we could potentially support a wider / more fine-grained range of operating systems (botan C++
includes a rather extensive list).
I’d potentially like to ditch the C++ in favor of pure C, as well as adding a flag to use botan C++
as a dependency instead of the included source, but so far, it appears to be working.
Ideally, there should be also a Scrubbed
or Secure
typeclass that requires a secure erasure function, thus making the actual solution more of a choice, and then we’d simply be providing a sensible default secureMemoryErase
function without getting in the way of other solutions. It may be a little convoluted, but this would allow botan
to depend on secure-memory-erase
while still using its own functions internally.
Between this and the bytearray-classes
that I’ve been sussing out, I think I have separated out and established the critical functionality that should make an update to botan-low
& botan
actually meaningful, but also more broadly applicable.
Hey everyone, it’s good to be updating again. I’ve been away for health reasons - I’m doing rather well lately but I very consciously have to take it easy in order to maintain that. I may be a bit quiet or slow to respond, but I’m here, and I’ll be taking my time as to make sure I don’t deplete myself again.