Persist 1.0.0.0 released

Persist is a binary serialization library with two aims:

  1. Have a default serialization format that can simply be derived for most types using Generic.
  2. Allow a type to have a custom serialization format that matches an external specification.

Most of the improvements in this release serve goal 2. There is now support for backpatching a length value, which avoids a second walk over the data structure. This is useful for tag-length-value formats where the length occupies a fixed number of bytes. It is also the source of some performance improvements in the default format for the built-in list datatype. The new getPrefix and unsafeGetPrefix helpers add matching support for deserializing these formats.
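To make the backpatching idea concrete, here is a minimal sketch that uses only base and a raw buffer. It is not Persist's implementation or API; pokeWord32LE, writeLengthPrefixed and encodeLengthPrefixed are hypothetical names, and the 64-byte buffer is an assumption made to keep the demo short.

```haskell
import Data.Bits (shiftR)
import Data.Word (Word32, Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (Ptr, minusPtr, plusPtr)
import Foreign.Storable (poke, pokeByteOff)

-- Write a Word32 at the given address, least significant byte first.
pokeWord32LE :: Ptr Word8 -> Word32 -> IO ()
pokeWord32LE p w =
  mapM_ (\i -> pokeByteOff p i (fromIntegral (w `shiftR` (8 * i)) :: Word8)) [0 .. 3]

-- Skip 4 bytes for the length, write the payload in a single pass,
-- then go back and fill in the length (the backpatch).
writeLengthPrefixed :: Ptr Word8 -> [Word8] -> IO (Ptr Word8)
writeLengthPrefixed start xs = do
  let payloadStart = start `plusPtr` 4
  end <- go payloadStart xs
  pokeWord32LE start (fromIntegral (end `minusPtr` payloadStart))
  pure end
 where
  go p [] = pure p
  go p (y : ys) = poke p y >> go (p `plusPtr` 1) ys

-- Demo driver. Assumes the payload is at most 60 bytes long.
encodeLengthPrefixed :: [Word8] -> IO [Word8]
encodeLengthPrefixed xs =
  allocaBytes 64 $ \buf -> do
    end <- writeLengthPrefixed buf xs
    peekArray (end `minusPtr` buf) buf

main :: IO ()
main = encodeLengthPrefixed [10, 20, 30] >>= print -- [3,0,0,0,10,20,30]
```

The list is walked once; without the reserved slot you would need to compute the length first and then walk the list again to write the elements.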

The API is similar enough to binary and cereal that it should be easy enough to try out the library.


Generic

inb4 compile time fearmonger pearl clutching (lmaooo)

On a serious note tho

Includes utilities to match externally specified data formats.

I couldn’t find this in the module haddocks. What does this mean exactly? Could I use this library to handle a binary format like UBJSON? Or is that not what “externally specified” means here?

That is the kind of thing meant by “externally specified.” I intend to use these new features in the cql library to match the Cassandra encoding.

For example, all of the numbers in UBJSON are written in big-endian format. You would want to use putBE for them. The default is little-endian if you use the default via Generic.
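To show what the byte-order difference looks like on the wire, here is a small sketch that uses the Builder from the bytestring package rather than Persist's own putBE/putLE. bytesBE and bytesLE are hypothetical helper names for this demo only.

```haskell
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word32, Word8)

-- Big-endian: most significant byte first.
bytesBE :: Word32 -> [Word8]
bytesBE = BL.unpack . B.toLazyByteString . B.word32BE

-- Little-endian: least significant byte first.
bytesLE :: Word32 -> [Word8]
bytesLE = BL.unpack . B.toLazyByteString . B.word32LE

main :: IO ()
main = do
  print (bytesBE 0x01020304) -- [1,2,3,4]
  print (bytesLE 0x01020304) -- [4,3,2,1]
```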

The API bits I would consider useful for external formats are:

  • putLE/putBE
  • reserveSize/resolveSize*
  • getPrefix

There is currently no lookahead or backtracking, so depending on the binary format it may be difficult to make it fit. Lookahead would be relatively easy to implement with the exposed internals.
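For anyone unsure what lookahead would buy you, here is a toy sketch. The Get type below is a stand-in defined from scratch over a list of bytes, not Persist's Get, and lookAhead, getByte and item are hypothetical names.

```haskell
import Data.Word (Word8)

-- A toy parser: consume some input, return a value and the remaining input.
newtype Get a = Get {runGet :: [Word8] -> Maybe (a, [Word8])}

instance Functor Get where
  fmap f (Get g) = Get $ \s -> fmap (\(a, rest) -> (f a, rest)) (g s)

instance Applicative Get where
  pure x = Get $ \s -> Just (x, s)
  Get f <*> Get g = Get $ \s -> case f s of
    Nothing -> Nothing
    Just (h, rest) -> fmap (\(a, rest') -> (h a, rest')) (g rest)

instance Monad Get where
  Get g >>= k = Get $ \s -> case g s of
    Nothing -> Nothing
    Just (a, rest) -> runGet (k a) rest

getByte :: Get Word8
getByte = Get $ \s -> case s of
  [] -> Nothing
  (b : rest) -> Just (b, rest)

-- Run a parser, keep its result, but restore the original input position.
lookAhead :: Get a -> Get a
lookAhead (Get g) = Get $ \s -> fmap (\(a, _) -> (a, s)) (g s)

-- Inspect the tag without consuming it, then pick a layout.
item :: Get [Word8]
item = do
  tag <- lookAhead getByte
  case tag of
    1 -> sequence [getByte, getByte] -- tag plus one payload byte
    _ -> sequence [getByte, getByte, getByte] -- tag plus two payload bytes

main :: IO ()
main = print (runGet item [1, 42, 99]) -- Just ([1,42],[99])
```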


Can you add instances for Generically (Programming & Proving by Jan van Brügge)?


The defaults rely on Generic, but can be easily overridden. If there were a PR that explained why Generically was useful, I’d be happy to merge and re-release.

@Iceland_jack's linked article makes a compelling case, IMHO:

In my opinion there are two issues here: 1) the reliance on DeriveAnyClass which introduces a footgun for the user and 2) DefaultSignatures force the author to provide only one way to derive instances

It also means that you can use Generically as the newtype for “construct instances in the obvious way, by taking advantage of the structure of the type”, and if you (for example) introduce newtypes for big-endian or little-endian encodings, the overall library surface becomes more ergonomically uniform.
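A rough sketch of what that could look like, using a toy Encode class rather than Persist's actual class. Encode, GEncode, BigEndian, Port and Header are all hypothetical names, sum types are left out for brevity, and Generically is taken from GHC.Generics, which assumes GHC 9.4 or later.

```haskell
{-# LANGUAGE DeriveGeneric, DerivingVia, FlexibleContexts, FlexibleInstances #-}
{-# LANGUAGE TypeOperators, UndecidableInstances #-}

import Data.Bits (shiftR)
import Data.Word (Word16, Word8)
import GHC.Generics

-- Toy stand-in for a serialization class.
class Encode a where
  encode :: a -> [Word8]

instance Encode Word8 where
  encode w = [w]

-- Little-endian by default.
instance Encode Word16 where
  encode w = [fromIntegral w, fromIntegral (w `shiftR` 8)]

-- Newtype that selects a big-endian encoding.
newtype BigEndian a = BigEndian a

instance Encode (BigEndian Word16) where
  encode (BigEndian w) = [fromIntegral (w `shiftR` 8), fromIntegral w]

-- Generic encoding for product types only.
class GEncode f where
  gencode :: f p -> [Word8]

instance GEncode U1 where
  gencode U1 = []

instance (GEncode f, GEncode g) => GEncode (f :*: g) where
  gencode (a :*: b) = gencode a ++ gencode b

instance Encode c => GEncode (K1 i c) where
  gencode (K1 x) = encode x

instance GEncode f => GEncode (M1 i t f) where
  gencode (M1 x) = gencode x

-- The "obvious" structural instance, selected with the Generically newtype.
instance (Generic a, GEncode (Rep a)) => Encode (Generically a) where
  encode (Generically x) = gencode (from x)

newtype Port = Port Word16
  deriving Encode via BigEndian Word16

data Header = Header {version :: Word8, port :: Port}
  deriving stock Generic
  deriving Encode via Generically Header

main :: IO ()
main = print (encode (Header 1 (Port 0x1F90))) -- [1,31,144]
```

Both the structural default and the byte-order choice are picked the same way, with a deriving via clause, which is the uniformity argument.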

Also, have you considered using an indexed monad to track the number of open holes that the user needs to go back and patch? With -XQualifiedDo this is much more ergonomic than it used to be, and you can then prevent the user from running a putter unless the putting code goes back and fills out all the space that it reserves.
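Roughly what I have in mind, as a self-contained toy. Put here is a made-up indexed state monad over a list of bytes, not Persist's Put, and every name below is hypothetical. The combinators are spelled >>>= and >>> so the sketch fits in one file; exported from their own module as (>>=) and (>>), they could be used with do notation via QualifiedDo.

```haskell
{-# LANGUAGE DataKinds, KindSignatures #-}

import Data.Word (Word8)

-- Peano naturals, promoted to the type level to count open holes.
data N = Z | S N

-- i = holes open before the action, j = holes open after it.
newtype Put (i :: N) (j :: N) a = Put {unPut :: [Word8] -> (a, [Word8])}

infixl 1 >>>=, >>>

(>>>=) :: Put i j a -> (a -> Put j k b) -> Put i k b
Put m >>>= f = Put $ \s -> let (a, s') = m s in unPut (f a) s'

(>>>) :: Put i j a -> Put j k b -> Put i k b
m >>> n = m >>>= \_ -> n

-- Offset of a reserved one-byte slot.
newtype Hole = Hole Int

-- Opens a hole: the index goes from n to S n.
reserve :: Put n ('S n) Hole
reserve = Put $ \s -> (Hole (length s), s ++ [0])

-- Plain writes leave the hole count unchanged.
byte :: Word8 -> Put n n ()
byte w = Put $ \s -> ((), s ++ [w])

-- Closes a hole: the index goes from S n back to n.
resolve :: Hole -> Word8 -> Put ('S n) n ()
resolve (Hole i) w = Put $ \s -> ((), take i s ++ [w] ++ drop (i + 1) s)

-- Only computations with no open holes can be run.
runPut :: Put 'Z 'Z a -> [Word8]
runPut (Put m) = snd (m [])

example :: Put 'Z 'Z ()
example =
  reserve >>>= \h ->
    byte 7 >>> byte 8 >>> resolve h 2

main :: IO ()
main = print (runPut example) -- [2,7,8]
```

Dropping the resolve call makes example fail to typecheck against runPut. Note that the index only counts holes: it does not stop you from resolving the same handle twice.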

I’ll consider using Generically when I drop support for GHC 9.2.


Indexed monads or linear types would be useful to track this. Even without the guard rails the reserve/resolve API is useful, and it’s unlikely to cause a problem. The code ends up looking like this:

  put l = do
    sizeHandle <- reserveSize @Word64
    go sizeHandle 0 l
   where
    go sizeHandle !n [] = resolveSizeLE sizeHandle n
    go sizeHandle !n (x : rest) = put x >> go sizeHandle (n + 1) rest

or something even simpler:

  putWithLength x = do
    sizeHandle <- reserveSize @Word32
    putWithoutLength x
    resolveSizeExclusiveLE sizeHandle

You can use package generically to get a compatibility shim (which re-exports the newtype from base when available, so it’s safe to unconditionally depend upon).

Good call on linear types - I had forgotten about them, but they’d be much easier to use and probably more performant to compile than the type-level gymnastics needed to track it all in an indexed monad. And if you’re only targeting GHC >=9.2, you should have workable linear types available to you.
