Bluefin-algae, algebraic effects in Bluefin

Lysxia · May 4, 2024, 5:02pm

The newly released bluefin-algae is a library for algebraic effects in the Bluefin effect system.

Algebraic effects are a primitive for user-defined effects. For the price of one computational effect (delimited continuations), get all of them! The README gives examples of concurrency and backtracking that would seem to require some fancy monads, but it’s just IO underneath with nothing other than the delimited continuations primitives.
Bluefin enables a particular flavor of effect handlers: “named effect handlers” (or “statically scoped effect handlers”) which have been discussed in recent literature (see references in the README), but as far as I can tell, more experimentation is necessary to better understand how that compares in practice to the dynamic scoping of effect handlers in most other systems. (One might hope that some of the reasons for preferring static scoping of variables also apply to the scoping of effects.)
One flaw of this implementation is that effectful operations traverse the stack to capture the continuation. Are there algebraic effect systems that avoid this issue? To give a non-example in case you’re wondering, OCaml Multicore, in spite of having a fancy call stack structure, must still do a linear search for a matching handler (this is arguably the fault of the dynamic scoping of effects!). For another non-example, effectful and bluefin don’t deal with delimited continuations (they are not “algebraic effect” systems).

ajbarber · May 5, 2024, 8:21am

Is this the first effects library to use the delimited continuations primitives?

tomjaguarpaw · May 5, 2024, 8:07pm

This is great to see! It brings me back full circle to my Reddit comment at the beginning of last year where I responded to your blogpost which presented a delimited continuation API with my own design. Bluefin’s API turned out to be almost that, with only minor changes! It was also the thread where I realised (thanks to @arybczak) that I could implement the API I wanted in terms of more familiar, basic primitives rather than delimited continuations.

I’m curious about this idea of represented delimited continuations as a GADT. It’s similar to polysemy, effectful and cleff. But is it really necessary. I find the “direct style” much easier. For example

data Filesystem m = MkFilesystem
  { readFileFS :: FilePath -> m String,
    writeFileFS :: FilePath -> String -> m ()
  }

rather than

data Filesystem :: Effect where
  ReadFile :: FilePath -> Filesystem m String
  WriteFile :: FilePath -> String -> Filesystem m ()

I wrote up how to direct style effects in cleff. I assume the same is possible in other effect systems too. But is there some benefit to the GADT style that I’m not seeing?

ArturGajowy · May 5, 2024, 10:12pm

To me it seems easier to achieve the following with GADTs:

logging / tracing invocations of effects
fuzzing / state machine testing
program rewrite / optimization

Other than convenience though, shouldn’t the initial and final encoding be equivalent in power?

Ambrose · May 6, 2024, 12:21am

It feels to me that whether you do GADT or record-of-functions, you’re gonna want TH to handle the boilerplate.

Like when I use cleff, the GADT is barely relevant to me. So this feels like a syntax-level difference maybe?

arybczak · May 6, 2024, 7:23am

But is there some benefit to the GADT style that I’m not seeing?

No need to invent the prefix/suffix for the field selector since constructors start with upper case letters.
You can force strictness for all interpreters for a certain operation.

On the topic of the library, this is a good example why delimited continuations need to be handled with care.

The fact that the API of the library permits you to circumvent the behavior of bracket and cause resource leaks by accident is a huge red flag.

tomjaguarpaw · May 6, 2024, 9:15am

Could you explain why? I don’t understand. What exactly is easier?

I presume so, but that’s my question! Why go indirectly through the thing that you don’t want if you can do directly through the thing you do? Is there something missing from the direct approach?

Sort of, but I think the GADT encoding is a greater conceptual difference from what one actually wants to achieve. For example here’s a “pure filesystem” implementation that uses the GADT encoding (from the Cleff docs). You want a function which reads a file? Attach it to the corresponding constructor of the GADT, which acts as a “hook”, get the parameter out of the constructor, and then define the function that reads a file, using that parameter.

filesystemToState = reinterpret \case
  ReadFile path -> gets (lookup path) >>= \case
    Nothing       -> fail ("File not found: " ++ show path)
    Just contents -> pure contents
  WriteFile path contents -> modify (insert path contents)

and here’s the same but with the direct encoding (from my PR). It seems conceptually much simpler to me. You want a function which reads a file? Just define a function which reads a file.

filesystemToState = reinterpretDirect
  MkFilesystem
    { readFileFS = \path ->
        gets (lookup path) >>= \case
          Nothing -> fail ("File not found: " ++ show path)
          Just contents -> pure contents,
      writeFileFS = \path contents ->
        modify (insert path contents)
    }

By way of evidence that I’m not arbitrarily deeming one approach “simpler”, I started down this line of thinking after watching the talk “Wire all the things!” by Eric Torreborre (Lambda Days 2023) where he says of effect systems

You have to introduce a separation, you cannot use functions directly any more, you have to use data types that are representing your functions [GADT style]. And at large, that makes the code a bit hard to navigate.

I guessed that wasn’t true – you don’t have to use “data types that are representing your functions” – and lo and behold, it isn’t!

ArturGajowy · May 6, 2024, 10:21am

Let me elaborate:

logging / tracing invocations of effects

with GADT’s you just need a Show instance for your GADT - which is easier to obtain than the TH needed to wrap a record of functions with invocation logging

fuzzing / state machine testing

you need the command ADT anyway, so the interactions and/or interleaves can be generated using Arbitrary.

program rewrite / optimization

is just value transformations rather than rewrite rules.

IOW, all those aspects are made easier by the invocation being reified.

Regarding:

Why go indirectly through the thing that you don’t want if you can do directly through the thing you do?

to me, once I see the need for using an effect system, I actually do want all of the above convenience, and hence GADTs. IME, losing the ability to introspect the program from within itself (by using final encoding) leads to more distracting and error-prone boilerplate.

Lastly, I have a slight preference towards the less syntax needed for expressing the initial encoding, as seen in your example

tomjaguarpaw · May 6, 2024, 11:04am

I see. Thanks for the explanations.

To be clear: they are not alternatives! When you have one, you have the other. They are interdefinable.

I’m just curious why all effect systems so far (except Bluefin) have prioritised the GADT style (and have not even explained how to use the direct style even though they do support it, as far as I can see). If GADT style was better for performance then I could understand. But is it? I do wonder whether it’s simply a case of some early effect system taking that approach and everyone else copying.

Lysxia · May 6, 2024, 8:39pm

A handler as forall x. E x -> IO x, where E is a GADT, is indeed equivalent to a record of function with one field per constructor of E. But “algebraic effect handlers” are of the form forall x. E x -> (x -> IO r) -> IO r. It wasn’t clear to me that there is still an equivalence with forall x. E x -> IO x/records-of-functions. That’s the main reason I stuck with GADTs in bluefin-algae. Well, you could have a record of functions with the extra (x -> IO r) parameter, but that doesn’t make things simpler for anyone.

Now that you make me think about the record-of-function representation again, a function forall x. E x -> IO x (equivalently, a record of functions) can express a function forall x. E x -> (x -> IO r) -> IO r by capturing the continuation. You can write this:

\(f :: forall x. E x -> (x -> IO r) -> IO r) ->
  (\(e :: E x) -> control0 tag \k -> f e k) :: forall x. E x -> IO x

(Some extra work needed to translate this snippet from IO to an Eff monad that has an explicit effect row, and to come up with the tag, but I think it’s feasible.)

This seems like a promising direction because those handlers forall x. E x -> IO x, or records of functions if you prefer, are compatible with Bluefin without delimited continuations, and they can be reused to do stuff with delimited continuations as the above shows. A slight cost to this approach is that it exposes reset and control0 (currently they are in bluefin-algae but only as an implementation detail).

From the point of view of the theory of algebraic effects, we only care about handlers of the form forall x. E x -> (x -> IO r) -> IO r, i.e., every operation immediately captures its continuation. Often that type isn’t even mentioned, its structure is hard-coded in the syntax of handlers, that’s just how some people think about algebraic effects: as an exception-matching construct with a captured continuation, or “resumable exceptions”. You can also think of a handler forall x. E x -> (x -> r) -> r as a fold of Freer E r. I think that’s why GADTs are so prominent in Haskell libraries inspired by algebraic effects.