[ANN] hsrs -- Ergonomic Haskell Bindings for Rust

A recent pain-point I’ve had is generating Haskell bindings for calling Rust code. Unfortunately, none of the existing prior art was really as ergonomic as I wanted it to be.

I’ve recently released hsrs – an ergonomic Haskell bindings generator for Rust. The goal of hsrs is to mimic the interfaces of PyO3 and napi-rs, and be as ergonomic as possible.

hsrs allows you to take this code

#[hsrs::module(safety = unsafe)]
mod quecto_vm {

/// CPU register identifiers.
#[derive(Debug, PartialEq, Eq)]
#[hsrs::enumeration]
pub enum Register {
  /// First general-purpose register.
  Reg0,
  /// Second general-purpose register.
  Reg1,
}

/// An error produced by the VM.
#[derive(Debug, PartialEq, Eq)]
#[hsrs::enumeration]
pub enum VmError {
  /// Division by zero.
  DivisionByZero,
}


/// A tiny VM with support for addition.
#[hsrs::data_type]
pub struct QuectoVm { registers: [i64; 2] }

impl QuectoVm {
  /// Create a new instance of the VM.
  #[hsrs::function]
  pub fn new() -> Self { ... }

  /// Adds register `b` into register `a` (a += b).
  #[hsrs::function]
  pub fn add(&mut self, a: Register, b: Register) { ... }

  /// Divides register `a` by register `b`, returning an error on division by zero.
  ///
  /// Demonstrates `Result<T, E>` → `Either E T` mapping across the FFI boundary.
  #[hsrs::function]
  pub fn safe_div(&mut self, a: Register, b: Register) -> Result<i64, VmError> { ... }
}

}

and generate

-- | CPU register identifiers.
newtype Register = Register Word8
  deriving newtype (Eq, Show, Storable)
  deriving (BorshSize, ToBorsh, FromBorsh) via Word8

pattern Reg0 = Register 0
pattern Reg1 = Register 1

-- | An error produced by the VM.
newtype VmError = VmError Word8
  deriving newtype (Eq, Show, Storable)
  deriving (BorshSize, ToBorsh, FromBorsh) via Word8

data QuectoVmRaw

-- | A tiny VM with support for addition.
newtype QuectoVm = QuectoVm (ForeignPtr QuectoVmRaw)

-- | Create a new instance of the VM.
new :: IO QuectoVm
new = do
  ptr <- c_quectoVmNew
  fp <- newForeignPtr c_quectoVmFree ptr
  pure (QuectoVm fp)

-- | Adds register `b` into register `a` (a += b).
add :: QuectoVm -> Register -> Register -> IO ()
add (QuectoVm fp) a b = withForeignPtr fp $ \ptr -> c_quectoVmAdd ptr (let (Register a') = a in a') (let (Register b') = b in b')

-- | Divides register `a` by register `b`, returning an error on division by zero.
--
-- Demonstrates `Result<T, E>` → `Either E T` mapping across the FFI boundary.
safeDiv :: QuectoVm -> Register -> Register -> IO (Either VmError Int64)
safeDiv (QuectoVm fp) a b = withForeignPtr fp $ \ptr ->
  fromBorshBuffer =<< c_quectoVmSafeDiv ptr (let (Register a') = a in a') (let (Register b') = b in b')

hsrs generates both the Haskell side and the necessary C FFI bridges in Rust. I achieve rich type semantics across both sides through borsh, which serializes values on the Rust side and deserializes them on the Haskell side.
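
To make the wire format concrete, here is a hand-rolled sketch of the bytes that would cross the boundary for the `Result<i64, VmError>` returned by `safe_div`. It does not use the borsh crate or hsrs itself, and the helper names `encode_div_result` and `decode_div_result` are made up for illustration. It assumes the layout borsh documents: one tag byte for `Result` (0 for `Err`, 1 for `Ok`), a one-byte variant index for a fieldless enum, and little-endian integers.

```rust
use std::convert::TryFrom;

/// Mirrors the `VmError` enum from the example above.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum VmError {
    DivisionByZero,
}

/// Hypothetical helper: lay out a result the way borsh is assumed to.
pub fn encode_div_result(r: Result<i64, VmError>) -> Vec<u8> {
    let mut buf = Vec::new();
    match r {
        Err(e) => {
            buf.push(0); // tag byte: Err
            buf.push(e as u8); // variant index of the fieldless enum
        }
        Ok(v) => {
            buf.push(1); // tag byte: Ok
            buf.extend_from_slice(&v.to_le_bytes()); // i64, little-endian
        }
    }
    buf
}

/// Hypothetical helper: the inverse of `encode_div_result`.
/// Returns `None` for input that does not match the assumed layout.
pub fn decode_div_result(bytes: &[u8]) -> Option<Result<i64, VmError>> {
    match bytes {
        [0, 0] => Some(Err(VmError::DivisionByZero)),
        [1, rest @ ..] => {
            // Exactly eight payload bytes are required for an i64.
            let raw = <[u8; 8]>::try_from(rest).ok()?;
            Some(Ok(i64::from_le_bytes(raw)))
        }
        _ => None,
    }
}
```

The Haskell side then only has to read the same bytes back, which is the role `fromBorshBuffer` plays in the generated `safeDiv` above.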

For a full example, I’d recommend you look at the QuectoVM example in the hsrs repo.

Quickstart

Mark your crate as a static library and add the hsrs crate:

[lib]
crate-type = ["lib", "staticlib"]

[dependencies]
hsrs = "0.1"

In your cabal file, add:

build-depends:
    hsrs >= 0.1 && < 0.2
extra-libraries: your_project
extra-lib-dirs: path/to/crate/target/release

As part of your compilation process, run the codegen:

cargo install hsrs-codegen
hsrs-codegen src/lib.rs -o YourProject.hs

And then use the bindings in your Haskell code:

import qualified YourProject
main :: IO ()
main = do
  vm <- YourProject.new
  someResult <- YourProject.exampleFunction vm
  print someResult

Prior Art

hs-bindgen

A relatively popular project is hs-bindgen, https://github.com/yvan-sraka/hs-bindgen. My understanding of this crate is that only primitive C types are supported, which did not suit my ergonomics requirements. hsrs supports serializable value types, mapping String → Text, Vec<T> → [T], Result<T, E> → Either E T, etc.

Purgatory

I stumbled upon Calling Purgatory from Heaven – https://well-typed.com/blog/2023/03/purgatory/ – after writing hsrs. It describes an approach similar to the one hsrs employs. The system outlined in that article consists of two packages – foreign-rust, https://github.com/BeFunctional/haskell-foreign-rust, and haskell-ffi, https://github.com/BeFunctional/haskell-rust-ffi. From here on, I will refer to these two packages as Purgatory. The similarities and differences are:

  • Both hsrs and Purgatory use borsh as the underlying serialization scheme for sharing value types across the FFI boundary.
  • hsrs, unlike Purgatory, generates the Haskell code for you from your Rust types: it emits the extern functions and writes the .hs binding files automatically. It also has some nifty features:
    • Automatic value-type serialization/deserialization.
    • Automatic Haddock codegen from your Rust doc comments.
    • Automatic derive propagation – things that you marked as Eq in Rust automatically get Eq in Haskell, etc.

Future Work

There are a couple of things I see hsrs growing into:

  • async – My current needs are exclusively synchronous, but I do see hsrs growing into adding support for async.
  • Stack allocation – hsrs does not support it yet. Even for fixed-size types, hsrs allocates on the heap. This could be improved for types of known size: the Haskell side can allocate a buffer that hsrs writes into, if the wire structure is Sized.
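
As a rough sketch of that second idea (not how hsrs works today), a Rust export could fill a buffer the caller owns instead of returning a heap allocation. The function name `write_i64_into`, its signature, and the 8-byte wire form are all assumptions made for illustration; a real export would also need a stable symbol name (e.g. via `no_mangle`).

```rust
/// Size of the hypothetical fixed wire form: one little-endian i64.
pub const WIRE_LEN: usize = 8;

/// Hypothetical export: serialize `value` into a caller-provided buffer.
/// Nothing is allocated on the Rust side. Returns the number of bytes
/// written, or 0 if the buffer is null or too small.
///
/// # Safety
/// `out` must be null or point to at least `cap` writable bytes.
pub unsafe extern "C" fn write_i64_into(value: i64, out: *mut u8, cap: usize) -> usize {
    if out.is_null() || cap < WIRE_LEN {
        return 0;
    }
    let wire = value.to_le_bytes();
    // The caller guarantees `out` is valid for `cap >= WIRE_LEN` bytes.
    unsafe { std::ptr::copy_nonoverlapping(wire.as_ptr(), out, WIRE_LEN) };
    WIRE_LEN
}
```

On the Haskell side, the buffer could come from something like `allocaBytes` in `Foreign.Marshal.Alloc`, so the bytes would live only for the duration of the call.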

Feedback is very welcome – I want hsrs to solve your needs as well as it does mine. I commit to supporting this project for the next year or so, to the best of my abilities.

Always cool to see improved ffi!

My initial question is why use a newtype around a Word8 and pattern synonyms instead of actual enumerated types? You can still go via an integral representation at the boundary, and you’d be able to derive Enum and Bounded, as well as letting the Haskell side write safer code.

For example:

data Register = Reg0 | Reg1 | Count
  deriving (Generic, Show, Eq, Enum, Bounded)
  deriving (BorshSize, ToBorsh, FromBorsh) via (ActuallyAsEnum Register)

instance (Enum a) => ToBorsh (ActuallyAsEnum a) where
  {- define using toEnum and fromEnum -}

Or indeed using AsEnum from Borsh itself.

Thank you for the reply and warm words!

It’s a little bit of debt and a hack, mostly done to unblock myself, coupled with a little bit of laziness and AI slop. I kind of want to completely rewrite the Sized codepaths, which would include better enum types.

In the v0.1 implementation of hsrs, I decided to take the naive approach and use borsh exclusively for compound types, in which case we allocate objects on the heap. I realized at some point that I wanted a completely different code path for Sized types (since we can avoid heap allocations entirely), and decided to accept the C FFI types as they were.

There are two core things I want to land in 0.2 – avoiding heap allocation for Sized types, and, by extension, richer trivial types (including enum types).

I’ll keep an eye on this. I spent a while researching the alternative options for Haskell-Rust FFI a few months ago, and settled on using Mozilla’s cbindgen to create a header file, then passing that to Well-Typed’s new hs-bindgen tool.

I should really publish a writeup and demo project for that. Although it’d be nice if I could get it properly integrated with Cabal Hooks first.
