Botan Cryptography Monthly Status Report #0

Monthly Status Report

It has been just over one month since this project received funding from the Haskell Foundation, and that means it is time for a status report on how things are going!

Where we were then

This project has already been underway for some time, so it is necessary to put some numbers to where we were at the start of the month, in order to compare to where we are now.

Here are some stats - at the start of the month, this project had:

  • 154 commits
  • 119 days
  • 1 repo
  • 1 language
  • 3 libraries
  • 99 modules
  • 11191 significant lines of code

There was some manual counting, plus find . -name \*.hs -exec cat {} + | grep -Evc '^\s*(--|$)' (which counts lines that are neither blank nor line comments) and git rev-list --count COMMIT, so the numbers are slightly subjective.
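For anyone who wants to reproduce the line count without grep, here is a minimal Haskell sketch of the same filter. The names isSignificant and countSignificant are made up for illustration and are not part of any botan library. Like the grep pattern, it only skips blank lines and lines that begin with --, so block comments still count.

```haskell
import Data.Char (isSpace)
import Data.List (isPrefixOf)

-- | A line is "significant" when it is neither blank nor a line comment.
-- This mirrors grep -Ev '^\s*(--|$)'. (Hypothetical helper name.)
isSignificant :: String -> Bool
isSignificant line =
  let trimmed = dropWhile isSpace line
  in not (null trimmed) && not ("--" `isPrefixOf` trimmed)

-- | Count the significant lines in the contents of one or more source files.
countSignificant :: String -> Int
countSignificant = length . filter isSignificant . lines
```

Feeding it the concatenated contents of every .hs file (for example via readFile or getContents) should give a figure comparable to the grep one-liner.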

At the time, I had just finished wrapping up the existing, read-only X509 support, with bindings, low-level implementations, and unit tests of the base Botan FFI. This was the last currently-existing module, and completing it left botan-bindings and botan-low functionally complete (alpha-complete), though there is still a need to refactor various things for safety, consistency, and final nomenclature, as well as some potential code migration upwards / downwards as things settle.

All things considered, the project was in a very good place. We had accomplished 3 of our goals already:

  • Bindings library
  • Low-level library
  • Unit tests <- we were here

The remaining major goals were:

  • Development of botan high-level library
  • CI
  • Tutorials
  • Documentation

Plus several high-value optional goals, which would require extending the C++ botan FFI interface:

  • Extended X509 support
  • TLS
  • Stream ciphers

Where we are now

With the library in a stable place, I thought it would be appropriate to tackle one of the larger unknowns, which would be the need to extend the Botan C++ library in order to supply missing FFI features and functions. At this point, I knew it was certainly possible, quite doable, but I had no idea how long it would take, or how much effort. Thus began this last month of work.

At first, I thought it would be possible to extend the FFI by writing an additional C++ layer, but I quickly ended up forking the Botan C++ library, as there was a limit to what I could do without updating the source code itself. This meant building Botan C++ from source, which in turn meant that I had to be able to point Botan-Haskell at the custom Botan-C++.

There are plans to submit these improvements back to the original Botan C++ library once they are appropriately complete.

Once that was complete, it was off to the races!

The rest of the month was spent working on extending the X509 interface in both C++ and Haskell, while building a bridge between the two. Now it is one month later, and I have gotten quite comfy with C and C++ again. The extended X509 support is now moderately functional in Haskell and C++, and although it’s still in progress, it too is in a very good place.

As a result of this work, we have several new features and modules:

  • Botan.*.X509.CA (Certificate Authority)
  • Botan.*.X509.CRL (Certificate Revocation List)
  • Botan.*.X509.CSR (Certificate Signing Request)
  • Botan.*.X509.Options
  • Botan.*.X509.Path (Path validation)
  • Botan.*.X509.Store

I would estimate the X509 FFI to be at 60-70% completion, with a high degree of certainty that it will be completed according to schedule.

Working hands-on with the Botan C++ source for a month has started to give me some opinions on it. I’ve effectively doubled the amount of code in the Botan C FFI, but as a result, the existing FFI patterns have become unwieldy / laborious at this scale. I keep having to resist the urge to refactor the entire FFI for consistency, now that I’ve worked with it for a while and know its idiosyncrasies (cough one giant giga-header). I could do some rework to rely more on defines and templates in order to reduce the C++ code footprint, but since we already have most of it working, that seems better left until later as a refinement.

Our stats for the end of the month are:

  • 234 commits
  • 149 days
  • 2 repos
  • 3 languages (Haskell, C, C++)
  • 4 libraries
  • 115 modules
  • 15326 significant lines of code (12917 Haskell + 2409 C/C++)

It is gratifying to see that I have kept a consistent pace.

Where we are going next

After taking a pause to see where we are (and thus write this report), it is now time to plan a little of the next month. We’re definitely going to be finishing up the extended X509 support, and after that we’re going to do a consistency pass to tackle some cleanup-and-polish in the bindings and low-level libraries, as well as focusing on the build pipeline: better README / installation instructions / CI.

As such, our timeline is roughly as follows:

  • Bindings library botan-bindings
  • Low-level library botan-low
  • Unit tests
  • Extended X509 support <- we are here
  • CI
  • High-level library botan
  • Tutorials
  • Documentation <- we need to get at least here
  • Test vectors
  • Stream Cipher support
  • TLS support

It is clear at this time that the optional deliverables (extended FFI for TLS, stream ciphers) may not make it by the end of this proposal’s funding. We knew going in that not all optional objectives would be achieved, but I am pleased to know that at least X509 will. Stream ciphers are small enough that they may be able to sneak in, but I suspect that implementing an FFI interface for Botan’s TLS functionality will require a specific effort comparable to that of implementing the X509 FFI.

That’s it for now! :partying_face:


You can follow the devlog for more frequent updates.


Love ittttt :tada:

Great work and thanks for the wonderful report. I’m always happy when I hear where the botan project is at :slight_smile:

And I very much sympathize with the “must… refrain… from… refactoring!” sentiment. It’s best to not, but I know the feeling of really wanting to. Keep it up :+1:


I might have missed this, but what’s the plan wrt. merging your x509 botan contributions into mainline? I think it’s important to plan ahead and save some time for addressing potential feedback from maintainers after the PR is opened etc.


@ApothecaLabs could you please clarify the CLOC numbers? At the beginning of the month there were “12917 significant lines of [Haskell] code” and at the end of the month there were “12917 Haskell + 2409 C/C++” lines, which suggests that the Haskell library remained unchanged. Yet you also say that “the month was spent working on extending the X509 interface in both C++ and Haskell”.


I’ve been having a rather personally impactful week (not bad things, so please do not worry), so I appreciate your patience in that it is taking me some time to respond :slight_smile:


@arybczak

I want to open a please-look-at-this-but-don’t-accept-it-just-yet PR as soon as next week.

There are many interdependencies within the X509 functionality, so part of the challenge of implementing the desired features has been figuring out what other features they rely on, and how much of that we have to implement to actually get things working. There are a few more things that I consider must be working before I’ll submit the PR, but no serious blockers.

I haven’t gotten around to updating the botan issue yet because I’ve gone and solved and answered many of the problems and questions that I would have asked, but I really should post something to the issue thread regardless :grimacing: That must wait for tomorrow though.


@Bodigrim

Oops!

The correct totals are 11191 significant lines of code (Haskell) before, and 15326 significant lines of code (12917 Haskell + 2409 C/C++) after. I believe I accidentally updated the wrong value set after calculating the after-total, which explains why I had to calculate and update it a second time.

I have updated the status report with the correct total.


That’s great progress! But…

How much feedback and comments from others did you receive for the API you’re freezing now?

You realize that without documentation and tutorials the audience your bindings are exposed to is necessarily quite small - which means the API you consider stable and permanent could be unsatisfactory to a potentially large number of people.


The project is broken up into several distinct libraries already in order to help alleviate such concerns; care has been taken that if the highest-level library does not express the right abstractions, you are free to use a lower-level library and write your own bindings.

The shape of the lower-level libraries botan-bindings and botan-low is effectively dictated by the Botan C FFI itself, as they seek to be a 1:1 match to it if possible, and there’s not much choice to be had at that level.

On the other hand, the high-level botan library (which I’m working on now that the low-level libraries are stable) is going to require that feedback, and I have been giving a lot of scrutiny to the ergonomics and interfaces of existing Haskell cryptography libraries (e.g. crypton, saltine, and several others) to see how well Botan can be made to fit existing conventions, at least nominally.

Ultimately, we are constrained somewhat by how Botan exposes certain primitives. An example of such differences / constraints might be that a primitive operation takes a random IV / nonce in one library, while in another library that primitive operation obscures the nonce by taking an RNG context and generating the nonce internally. For example, Botan ciphers require a nonce, but pubkey encryption takes an RNG, and our hand is forced somewhat.
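To make the two interface shapes concrete, here is a minimal sketch. Everything in it is hypothetical: ToyRNG, encryptWithNonce and encryptWithRNG are illustrative names, not Botan or botan-low functions, and the XOR "cipher" is a stand-in with no security value at all. It only shows that an RNG-taking operation can be layered on top of a nonce-taking one, while the reverse gives the caller no direct way to choose the nonce.

```haskell
import Data.Bits (xor)
import Data.Word (Word8)

type Key   = [Word8]
type Nonce = [Word8]

-- Toy deterministic generator standing in for an RNG context (hypothetical).
newtype ToyRNG = ToyRNG Int

-- One step of a simple linear congruential generator.
nextByte :: ToyRNG -> (Word8, ToyRNG)
nextByte (ToyRNG s) =
  let s' = (s * 1103515245 + 12345) `mod` 2147483648
  in (fromIntegral (s' `div` 65536), ToyRNG s')

nextBytes :: Int -> ToyRNG -> ([Word8], ToyRNG)
nextBytes n g
  | n <= 0    = ([], g)
  | otherwise =
      let (b, g')   = nextByte g
          (bs, g'') = nextBytes (n - 1) g'
      in (b : bs, g'')

-- Shape A: the caller supplies the nonce explicitly.
-- XOR against a repeating pad: NOT real cryptography, illustration only.
encryptWithNonce :: Key -> Nonce -> [Word8] -> [Word8]
encryptWithNonce key nonce msg
  | null pad  = msg
  | otherwise = zipWith xor msg (cycle pad)
  where
    pad = key ++ nonce

-- Shape B: the operation takes an RNG and generates the nonce internally,
-- returning it alongside the ciphertext and the advanced generator.
encryptWithRNG :: ToyRNG -> Key -> [Word8] -> (Nonce, [Word8], ToyRNG)
encryptWithRNG g key msg =
  let (nonce, g') = nextBytes 12 g
  in (nonce, encryptWithNonce key nonce msg, g')
```

A high-level library has to pick one shape per operation, and when the underlying call only offers shape B, exposing shape A on top of it is not straightforward.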


I have spent a lot of time thinking about what ergonomic and correct Haskell interfaces should look like. Something to consider is that the botan libraries might not be the correct place to express certain abstractions, even though we certainly could.

For example, I could recreate typeclasses like HashAlgorithm, Cipher, BlockCipher, etc. from crypton (or implement appropriate instances of them), but the place for those higher-level / class-y abstractions is in a library that isn’t attached to a particular backend like Botan. This project is specifically about Botan, and higher-level abstractions are necessarily out of scope so long as Botan-specific work remains.
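As a rough sketch of that separation, a backend-agnostic library could own the class while each backend only supplies instances. The class below is hypothetical and deliberately simplified; it is not crypton's HashAlgorithm, and ToyChecksum is a placeholder backend rather than a real hash.

```haskell
import Data.Proxy (Proxy (..))
import Data.Word (Word8)

-- Hypothetical class that would live in a backend-agnostic library.
class HashAlgorithm alg where
  hashName   :: Proxy alg -> String
  digestSize :: Proxy alg -> Int
  hash       :: Proxy alg -> [Word8] -> [Word8]

-- Stand-in backend: a one-byte additive checksum, NOT a real hash function.
-- A Botan-backed instance would call into the Botan bindings here instead.
data ToyChecksum

instance HashAlgorithm ToyChecksum where
  hashName _   = "toy-checksum"
  digestSize _ = 1
  hash _ bytes = [sum bytes]  -- Word8 arithmetic wraps modulo 256

-- Generic code depends only on the class, never on a specific backend.
describe :: HashAlgorithm alg => Proxy alg -> [Word8] -> String
describe p msg = hashName p ++ ":" ++ show (length (hash p msg))
```

Code like describe would keep working unchanged if the instance were swapped for one backed by a different library, which is the argument for keeping such classes out of the Botan packages themselves.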

These sorts of things aren’t forgotten, however; I’ve been working on a crypto-schemes library on the side as a sandbox for such higher-level cryptography. When we get to that point (using Botan as a backend for a more generalized cryptography interface) I’ll be seeking gobs of feedback.
