Questionable compile time evaluation via TH lift

I am working on a project that compiles an embedded expression DSL into a circuit/graph-like structure. These expressions are evaluated by a feed-forward pass over the circuit, so if you are going to run it on multiple sets of inputs, ideally you would compile the DSL expression only once.

I have binary codecs for the circuit types, so to construct an evaluation executable it’s easy enough to run the compiler as a one-time setup, write the output circuit to a file, and have the evaluator deserialize this file. You can even use TH to bake that file into the executable.
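For reference, a minimal sketch of the "bake the file in" approach. It assumes the file-embed package and a file named circuit.bin that exists at build time; both are assumptions made for illustration, and the decoding step is left out because it depends on the project's own codecs.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main (main) where

import qualified Data.ByteString as BS
import Data.FileEmbed (embedFile)

-- Raw bytes of the serialized circuit, read at compile time.
-- "circuit.bin" is a placeholder path (an assumption); it is resolved
-- relative to the directory the build runs in.
circuitBytes :: BS.ByteString
circuitBytes = $(embedFile "circuit.bin")

main :: IO ()
main =
  putStrLn ("embedded circuit size in bytes: " ++ show (BS.length circuitBytes))
```

The embedded bytes still have to be deserialized at runtime, but that can be done once at startup and the result reused across evaluations.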

However, when I tried to use compile-time evaluation via TH lift (which to me seems more natural, as it doesn’t deserialize on every evaluation), I noticed that compilation was extremely slow and consumed an ungodly amount of memory. I’m not savvy enough to profile this static evaluation, but I did spend a lot of time with the profiler when developing the compiler. I can definitely tell that whatever GHC is doing to statically evaluate the compile expression is far more intensive than running the compile phase separately, to the point of consuming all the memory on my machine for large circuits that are easily compiled using the other method.

I don’t know a lot about TH and even less about what the static evaluation is doing, and it’s possible that I’m just doing something wrong. Does anyone know if there are some caveats/“gotchas” that come with lift? Is there any guide to avoiding them? Does anyone have any idea what profiling techniques I can use in this instance to figure out what GHC is doing?

When you lift a large data structure, the compiler’s heap contains not only the data structure itself but also the Exp that represents its whole Haskell syntax tree, which is naturally many times larger than the original. And remember, this Exp becomes the input to the entire GHC optimization pipeline, so conceptually you’re compiling a .hs file that is tens or hundreds of megabytes in size, or maybe more. So it’s normal for this to be slow and memory-bound.
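To get a feel for this, here is a small sketch that lifts a plain list outside of any splice and looks at the resulting Exp. The list is only a stand-in for a compiled circuit, and the length of the shown Exp is a crude proxy for syntax-tree size rather than a real heap measurement.

```haskell
module Main (main) where

import Language.Haskell.TH (Exp, runQ)
import Language.Haskell.TH.Syntax (lift)

-- Crude proxy for the size of a syntax tree: the length of its Show output.
exprSize :: Exp -> Int
exprSize = length . show

main :: IO ()
main = do
  -- Stand-in for a compiled circuit (an assumption for illustration).
  let payload = [1 .. 10000] :: [Int]
  -- The Exp that a splice like $(lift payload) would hand to GHC.
  expr <- runQ (lift payload)
  putStrLn ("list elements:         " ++ show (length payload))
  putStrLn ("length of shown Exp:   " ++ show (exprSize expr))
```

Every element becomes its own syntax node (a literal wrapped in constructors), and for a real circuit type each constructor application and field adds further nodes, all of which GHC then has to rename, typecheck, and optimize.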

@TerrorJack Let me follow up with another question because I know you are an expert in this topic (also maybe @amesgen):

Say I’m going to abandon this partial evaluation strategy and just read the compiled program from a file for evaluation. Additionally (this is where it gets complicated), I want to do this evaluation in the browser via a WASI reactor module using the GHC WASM backend. You can assume the file is somewhat large, maybe on the order of tens of megabytes uncompressed. I originally tried to follow the example laid out in browser-wasi-shim; you can find my version here.

However, any time I try to read the file /circuit.bin or write to any file in /, I get some cryptic error trace like

Prover Error: Error: exit with exit code 1
    at proc_exit (http://localhost:1234/index.js:100865:23)
    at wasm-solver.wasm.__wasi_proc_exit (wasm://wasm/wasm-solver.wasm-00aa93ce:wasm-function[6594]:0x1111a5)
    at wasm-solver.wasm._Exit (wasm://wasm/wasm-solver.wasm-00aa93ce:wasm-function[6602]:0x11178c)
    at wasm-solver.wasm.exit (wasm://wasm/wasm-solver.wasm-00aa93ce:wasm-function[6621]:0x111f47)
    at wasm-solver.wasm.stg_exit (wasm://wasm/wasm-solver.wasm-00aa93ce:wasm-function[5816]:0xc902e)
    at wasm-solver.wasm.shutdownHaskellAndExit (wasm://wasm/wasm-solver.wasm-00aa93ce:wasm-function[5819]:0xc91f0)
    at wasm-solver.wasm.r2SE_entry (wasm://wasm/wasm-solver.wasm-00aa93ce:wasm-function[5617]:0xc342b)
    at wasm-solver.wasm.StgRun (wasm://wasm/wasm-solver.wasm-00aa93ce:wasm-function[6193]:0xedc45)
    at wasm-solver.wasm.scheduleWaitThread (wasm://wasm/wasm-solver.wasm-00aa93ce:wasm-function[5863]:0xcf762)
    at wasm-solver.wasm.rts_inCall (wasm://wasm/wasm-solver.wasm-00aa93ce:wasm-function[5801]:0xc8501)

I can verify that the file circuit.bin has been included in the WASI fs and can inspect the array buffer, so the file is there. Additionally, I can verify that this initialize callback is working correctly and the necessary exports are there, so it’s none of those sorts of things.

So my follow-up question is: do you see anything obviously wrong with the way I’m trying to use the file system via the WASM host? Is there a better way to embed large blobs like this that is less finicky and would also work in non-browser hosts (e.g. via Rust wasmer)?

For the record, the WASI reactor module works fine if I just recompile the circuit as part of the WASM program, but again for obvious reasons I’d much rather save the work and read it in from somewhere else.

My quick guess is that an uncaught exception reached the top-level handler, which exits the program like this.

I know nothing about snarkjs or how it might invoke the wasm module, though. If you can get JSFFI to work (requires 9.10/head), then there’s a way to dump debug info directly via the console API, bypassing the wasi layer. Otherwise, you might want to double-check that the wasi layer’s stderr actually works properly and use that.
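One way to make such a failure visible, sketched under the assumption that stderr is wired up on the host side: wrap the body of each exported function so that exceptions are logged instead of reaching the top-level handler. The name guarded and the file path below are hypothetical.

```haskell
module Main (main) where

import Control.Exception (SomeException, displayException, try)
import qualified Data.ByteString as BS
import System.IO (hPutStrLn, stderr)

-- Run an action; if it throws, log the exception to stderr and
-- return the fallback value instead of letting the exception escape.
guarded :: String -> a -> IO a -> IO a
guarded label fallback action = do
  result <- try action
  case result of
    Right x -> pure x
    Left e -> do
      hPutStrLn stderr (label ++ ": " ++ displayException (e :: SomeException))
      pure fallback

main :: IO ()
main = do
  -- "/no-such-circuit.bin" is a deliberately missing, hypothetical path.
  size <- guarded "readCircuit" (-1) (BS.length <$> BS.readFile "/no-such-circuit.bin")
  print size
```

Catching SomeException this broadly is only meant as a debugging aid; once the real cause is known, a narrower handler is usually the better choice.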

You can read the file while pre-initializing with Wizer. We initially did something very similar to what you are doing here in Ormolu Live to read and parse the fixity database at runtime (i.e. when an end user loads the WASM file in their browser), but then switched (in this PR) to reading and parsing it from a file during the Wizer pre-initialization, so that this work does not have to happen for end users. Instead of just parsing, you can of course do arbitrary precomputation, so this might be interesting for your use case. You also already have the basic scaffolding :sweat_smile:, see ghc-wasm-meta for more.
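On the Haskell side, one possible shape for this is a global slot that an exported initialization function fills in, so that later calls only read from it. This is a sketch under assumptions, not Ormolu Live's actual code: the export names preinit and circuitSize are hypothetical, the raw bytes stand in for a decoded circuit, and how the init function gets exported and invoked during pre-initialization is covered by the Wizer and ghc-wasm-meta documentation rather than shown here.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import qualified Data.ByteString as BS
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Global slot for the precomputed data. NOINLINE keeps it a single
-- shared reference.
circuitRef :: IORef (Maybe BS.ByteString)
circuitRef = unsafePerformIO (newIORef Nothing)
{-# NOINLINE circuitRef #-}

-- Hypothetical init function, meant to run once during pre-initialization.
-- A real version would decode the bytes and store the decoded circuit.
preinit :: IO ()
preinit = do
  bytes <- BS.readFile "/circuit.bin"
  writeIORef circuitRef (Just bytes)

foreign export ccall "preinit" preinit :: IO ()

-- Hypothetical entry point: size of the stored data, or -1 if preinit
-- has not run yet.
circuitSize :: IO Int
circuitSize = maybe (-1) BS.length <$> readIORef circuitRef

foreign export ccall "circuitSize" circuitSize :: IO Int

-- Only here so the sketch builds as an ordinary executable; a reactor
-- module would expose just the exports.
main :: IO ()
main = circuitSize >>= print
```

The idea is that the snapshot taken after pre-initialization already contains the filled slot, so end users never pay for reading or decoding the file.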

FTR: WASI stdout/stderr works when you add the corresponding fds, i.e. like in browser_wasi_shim’s README:

  let fds = [
    new OpenFile(new File([])), // stdin
    ConsoleStdout.lineBuffered(msg => console.log(`[WASI stdout] ${msg}`)),
    ConsoleStdout.lineBuffered(msg => console.warn(`[WASI stderr] ${msg}`)),
    new PreopenDirectory("/", [
      ["circuit.bin", await load_external_file("circuit.bin")]
    ])
  ];

That might give you a better error message (and maybe some other things also assume that the first three file descriptors are actually backed by stdin/stdout/stderr, and not some file).

Thank you so much for this answer, and also to @TerrorJack! Indeed there was something subtly wrong in my code that had nothing to do with the file system operations, but I was only able to figure it out with your suggestions and instructions for how to pipe to stdout/stderr.

As for that basic scaffolding, it’s a direct copy/paste from Ormolu Live (which I think a lot of people are using as the default GHC WASM template :joy:). The comment about pre-initialization looks really promising, so I’m looking forward to trying that now that the basics are working.

I really appreciate the work y’all are putting in; it’s amazing that all of this stuff is working so well. Pretty much nothing I’ve been doing would have been possible until very recently, so keep up the good work!
