Serializing Haskell functions to disk

Every data type you define gets a Typeable instance, and I believe all built-in types do too, but existentially or universally quantified types do not.

3 Likes

You could fork hell, which is 1.2k lines of code in a single file and gets you a mini-Haskell for scripting. It’s pretty easy to configure which types and which primitives are supported, both monomorphic and polymorphic. It has some basic type classes like Eq, Ord, Show and Monad, but you can’t write your own within the object language.

I kept it intentionally one file to artificially limit the size of the implementation, and to make it easy for someone to fork and re-use for another purpose.

You could generate an untyped AST in haskell-src-exts format, which is easy to serialize to/from disk. [The subset of] Haskell syntax is stable over time, unlike some binary format, which is likely to lead you into trouble. GHC’s API and general infrastructure is also fast-moving and wobbly, so I would never use it in a deployed app. If you’re always planning on generating code that has type annotations, then you probably don’t need type inference and could cut out that whole block of code, bringing it down to ~500 lines.
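To make the AST-on-disk idea concrete, here’s a minimal sketch assuming the haskell-src-exts package: parse an expression into an untyped AST, then pretty-print it back to source text, which is the stable on-disk representation being suggested.

```haskell
import Language.Haskell.Exts (fromParseResult, parseExp, prettyPrint)

-- Parse an expression into haskell-src-exts' untyped AST, then render
-- it back to source text; the printed form is what you'd store on disk.
main :: IO ()
main = do
  let ast = fromParseResult (parseExp "\\x -> x + 1")
  writeFile "expr.hs" (prettyPrint ast)
  roundtripped <- readFile "expr.hs"
  putStrLn roundtripped
```

Reading it back is just `parseExp` again, so the serialization format is ordinary Haskell source rather than a version-fragile binary encoding.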

Performance-wise, a basic fib implementation slightly outperforms GHCi, which isn’t a brag or a rigorous benchmark, but it does indicate that its performance isn’t bad. That makes sense: the evaluator would fit on a napkin and doesn’t really do anything.

One small note: if performance or distribution becomes a problem, you can generate Haskell code from a Hell AST and compile it with GHC (via haskell-src-meta, which parses source to template-haskell abstract syntax), since Hell doesn’t “add” any features that Haskell doesn’t already have, and you could output WASM down the line if needed. That adds the burden of depending on the GHC toolchain, but it’s a path forward nonetheless.

3 Likes

More to the point, do things like Vector a → a have a Typeable instance? I suspect we’re talking about trainable models in the machine-learning sense here.

That’s a great practical example, and it shows what I mean: the function and the vector are Typeable, but the polymorphic a is slightly problematic. Instead you’d have to use Typeable a => Vector a -> a.
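Here’s a small sketch of that point, using a plain list instead of Vector to avoid the dependency: with a Typeable constraint, the “polymorphic” element type can be reified to a runtime TypeRep, which is what any serializer would need to inspect.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Typeable (Proxy (..), Typeable, typeRep)

-- The Typeable constraint lets us recover the element type at runtime,
-- even though 'a' is polymorphic at the definition site.
elemType :: forall a. Typeable a => [a] -> String
elemType _ = show (typeRep (Proxy :: Proxy a))

main :: IO ()
main = do
  putStrLn (elemType ([1, 2, 3] :: [Int]))       -- prints "Int"
  print (typeRep (Proxy :: Proxy (Int -> Bool))) -- monomorphic function types work too
```

Without the constraint, `elemType` simply doesn’t typecheck: there is no dictionary carrying the runtime representation of `a`.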

1 Like

Based on the description of the problem, I would personally look into defunctionalizing your Strategy. I.e. instead of serializing the function, serialize whatever data you used to construct that function. After all, you’re not reading Haskell code from an input field in the UI or from a database; serialize the thing you actually read, or maybe an intermediate stage between the raw input and the eventual Strategy.

Independent of the language used, serializing functions is very messy conceptually (except maybe for languages like C where functions can’t have closures).

I think of defunctionalization as sort of an incremental way to implement a DSL. Initially there’s only one Strategy, so your DSL’s grammar is isomorphic to (), i.e. that one Strategy is the only thing that’s expressible:

theOnlyStrategy :: UTCTime -> Bid
theOnlyStrategy = the . winning . formula

data DSLv1 = TheStrategy

toStrategy :: DSLv1 -> Strategy
toStrategy TheStrategy = theOnlyStrategy

Say you add 2 more Strategys in the next release, one parameterized over an integer and another over a boolean. Your DSL is then represented as S1 | S2 Integer | S3 Bool, you get a simple migration from the first version of the DSL, and you write a new toStrategy from DSLv2:

migrate :: DSLv1 -> DSLv2
migrate TheStrategy = S1

toStrategy :: DSLv2 -> Strategy
toStrategy S1 = theUsedToBeOnlyStrategy
toStrategy (S2 i) = ...
toStrategy (S3 b) = ...

This would give you the following benefits:

  • You have complete introspection into serialized strategies, you can display them, design domain-specific UIs for modifying them, statically analyze them and even translate them to an SMT solver and prove properties about them.
  • New releases can improve existing strategies or fix bugs in them.
  • It’s just simple Haskell, no need to worry about unspeakable horrors involved in serializing code + runtime closures.
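A nice consequence of that last point: because the defunctionalized DSL is plain data, persistence is trivial. A sketch, reusing the DSLv2 type from above — the stock Show/Read instances already give you a working on-disk format (swap in binary or aeson for anything performance- or schema-sensitive):

```haskell
data DSLv2 = S1 | S2 Integer | S3 Bool
  deriving (Show, Read, Eq)

-- Plain data serializes for free via the derived Show/Read instances;
-- no closures or runtime environments are involved.
saveStrategy :: FilePath -> DSLv2 -> IO ()
saveStrategy path = writeFile path . show

loadStrategy :: FilePath -> IO DSLv2
loadStrategy path = fmap read (readFile path)
```

The round-trip law `read (show x) == x` holds for any such plain data type, which is exactly the property you can’t get for a function value with a captured environment.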
9 Likes

That seems to be the best approach for now. Thank you!

1 Like

Also from an ML angle: if the strategy belongs to a parametric family of functions (e.g. it’s a polynomial of a fixed degree), you only need to serialize the coefficients, so the closure only needs to exist at runtime (converting the UTCTime to seconds before doing arithmetic on it):

eval :: DSLv3 -> (UTCTime -> Double)
eval (S3 i1 i2 i3) utc = i3 + i2 * t + i1 * t ** 2
  where t = realToFrac (utcTimeToPOSIXSeconds utc) -- from Data.Time.Clock.POSIX
3 Likes

Potentially of interest:

1 Like

At what point do you figure you’re implementing a Lisp in Haskell?
Might as well use Common Lisp with Coalton.

Bartosz has this art-school-teacher vibe: “lambdas are closures, mmmkay?”

1 Like

I think that’s the C++ programmer in Bartosz speaking. When he says “lambdas are not named functions”, he means named functions in the C++ sense, where they can only be top-level. And when he says lambdas are closures, he means it in the sense that C++ lambdas get turned into a closure object if they have a non-empty capture clause. So I think that bit is him talking to his inner C++ programmer: he can’t just pass around function pointers (which are effectively StaticPtr (a -> b) in Haskell land).
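The StaticPtr analogy can be sketched with GHC’s StaticPointers extension. Note that `static` only accepts closed expressions — no captured environment — which is precisely the “plain function pointer” restriction being described:

```haskell
{-# LANGUAGE StaticPointers #-}

import GHC.StaticPtr (StaticPtr, deRefStaticPtr, staticKey)

-- 'static' rejects expressions with free local variables, so no runtime
-- environment is captured; the pointer is a stable key into the binary.
incPtr :: StaticPtr (Int -> Int)
incPtr = static (\x -> x + 1)

main :: IO ()
main = do
  print (deRefStaticPtr incPtr 41) -- dereference and apply: prints 42
  print (staticKey incPtr)         -- the serializable fingerprint
```

That `staticKey` fingerprint is what distributed frameworks ship over the wire instead of the function itself, sidestepping closure serialization entirely.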