Theseus: Worry free algebraic and higher order effects

Hi, I’m excited to share Theseus, an effect system library supporting algebraic effects (including nondeterminism and coroutines), higher order effects, guaranteed finalizers, and semantics that do not change regardless of how you order interpreters. The project’s readme (linked above) has a tutorial, a showcase, and annotated source files explaining how it works.

Copying a couple of those links directly:

  • A tutorial walking you through creating an effect for interacting with the terminal, testing the effect, creating a higher order effect, and understanding private and public effects.
  • A showcase providing an example of something that other effect systems can struggle with.

To make sure it worked in the real world, I ported an old project over to Theseus called the Daily Reporter. It scrapes the web and sends Matrix messages with what it finds. Since that migration was successful, I figured now would be a good time to share Theseus.

So how does it work? Theseus uses a higher order Freer Monad to merge algebraic and higher order effects, and it uses a new class called ControlFlow to manage finalizers and order independent interpretations. There’s more of an explanation here although I’d recommend going through at least the tutorial first. Theseus makes some trade offs to accomplish these goals. In particular the ControlFlow mechanism requires that Theseus modify the semantics of nondeterminism and coroutines a bit. Nondeterminism ends up running in more of a breadth first style, and suspended functions need to be used linearly (otherwise finalizers might get duplicated or skipped). Overall though I think these are useful trade offs to make! You lose out on some control flow flexibility, but in exchange you get easy resource management.

If anyone has questions about Theseus, I’d be happy to answer them! If anyone has ideas for improving Theseus, I’d be happy to hear those too!

8 Likes

Unfortunately I don’t think Theseus’s finalizers are guaranteed. Compare effectful, where Finalizer is printed:

ghci> :t finally
finally :: Eff es a -> Eff es b -> Eff es a
ghci> runEff (finally undefined (liftIO (putStrLn "Finalizer")))
Finalizer
*** Exception: Prelude.undefined
CallStack (from HasCallStack):
  undefined, called at <interactive>:18:17 in interactive:Ghci4

to Theseus, where it isn’t:

ghci> :t finally
finally
  :: ef Identity => Eff ef es () -> Eff ef es a -> Eff ef es a
ghci> runEffIO (finally (liftIO (putStrLn "Finalizer")) undefined)
*** Exception: Prelude.undefined
 
HasCallStack backtrace:
  undefined, called at <interactive>:20:51 in interactive:Ghci4

[N.B. Theseus takes its arguments to finally in the opposite order than the standard.]

It’s very hard for an effect system to have guaranteed finalizers unless it’s “analytic” (i.e. wraps IO, which Theseus doesn’t) in the sense I describe in A History of Effect Systems. I believe Heftia (synthetic, not analytic) claims to achieve guaranteed finalizers by forbidding IO in certain contexts, but I haven’t managed to understand Heftia very well.


Oh, and in case you were wondering, this has nothing to do with the body being bottom. The finalizer is missed on any RTS exception:

ghci> data Ex1 = Ex1 deriving Show
ghci> instance Exception Ex1
ghci> runEffIO (Theseus.Eff.finally (liftIO (putStrLn "Finalizer")) (liftIO (Control.Exception.throwIO Ex1)))
*** Exception: Ex1
 
HasCallStack backtrace:
  throwIO, called at <interactive>:29:72 in interactive:Ghci7
5 Likes

Wow, this is cool. It’s hard to combine algebraic and higher-order effects with a freer monad, but your ControlFlow trick for order-independent finalizers is smart. Breadth-first nondeterminism makes sense as a tradeoff. You just have to be careful when using suspended functions in a linear way, or else finalizers can run twice or disappear without warning. I’m interested to see how this works with complicated effect stacks. It might be worth stress-testing it with nested nondeterminism and async IO to see if the semantics really hold.

2 Likes

@tomjaguarpaw you switched arguments to finally.

Looks like a cool exploration project :+1:

How does an interpreter of an effect

data Fork :: Effect where
  AsyncWithUnmask :: ((forall a. m a -> m a) -> m r) -> Fork m (Async r)

that just wraps asyncWithUnmask from async look like?

Btw, did you experiment with delimited continuations? I personally consider free monad based solutions to be DOA for anything serious due to unpredictable performance on all fronts (time, memory usage and GC pressure), so I’m always wondering why people go there instead of trying to refine the path eff went.

3 Likes

Theseus’s finally switched the argument order, which is why I also printed out the types of each version of finally. To avoid confusion I should have called this out in my post, and will make an edit.

3 Likes

@tomjaguarpaw, yeah I should find the right way to qualify the “guaranteed” part of “guaranteed finalizers”. In this case guaranteed means the finalizer will run exactly once as long Theseus controls the control flow. I think this is cool because, even qualified like that, it’s not a guarantee plenty of algebraic effect systems can make. Koka, a language designed around algebraic effects, leaves the problem open. A very practical, useful, and valid answer is to disallow effects which don’t interact well with IO (I think that’s what the IO based effect systems do, but correct me if I’m wrong). With Theseus I’m trying a different direction.

With that said, the problem of IO exceptions in Theseus is solvable, but it requires some careful management at the edges. Whenever you call liftIO, you can wrap the IO in try and rethrow the exception on Left. Same for pure exceptions with a little unsafePerformIO thrown in. Maybe Theseus could default to doing this? That would require both IOE and Throw SomeException in the effect stack for Eff to implement MonadIO. I might have to experiment with that some. Pure exceptions would be similar but more invasive, so I’m hesitant to invest much time in them. I feel pure exceptions are also more likely to just abort the program making finalizers less useful.

@barb thanks, I’ll look into some more stress testing. I have some coroutine+nondeterminism and nested nondeterminism tests, but they’re pretty basic. Those are the less commonly used effects, so it’s harder to come up with realistic test cases for them. If you have some examples I can give them a try.

@arybczak For AsyncWithUnmask, you wouldn’t be able to just wrap the function from async because Theseus isn’t UnliftIO. Unfortunately that also makes masking and threading hard. Some kind of cooperative multitasking built on Coroutine might work well though. I imagine the concept of masking would still apply to that even if it’s not the IO kind. I’ll have to experiment a bit with it.

I actually started Theseus as the opposite of what it is now lol. It was ReaderT IO based but used delimited continuations to implement the missing algebraic effects. Unfortunately some of the other weird experimental stuff I tried in that version didn’t turn out as well. If I get past the experimentation phase with Theseus, I’d be interested in trying to port it to something with better performance characteristics. Currently it’s all based on really slow building blocks, but the building blocks are exposed to the type system in ways that unsafeCoerce and primops can’t be. That gives me more confidence in the experimentation.

1 Like

Aha, OK, nice!

Yes, that’s right.

OK, well, I suspect it’s easier said than done! But you know better than me.

Is it fair, then, to describe Theseus as a new step in algebraic and higher order effect research, but not as suitable for production use?

Design issues aside, I added theseus to effectful’s benchmark suite here out of curiosity to see how it performs and here are the results:

Countdown.1000
$ cabal run bench -- -p countdown.1000
All
  countdown
    1000
      reference (pure): OK
        1.94 μs ±  89 ns,  16 KB allocated,   2 B  copied, 6.0 MB peak memory
      reference (ST):   OK
        2.42 μs ± 178 ns,  16 KB allocated,   3 B  copied, 6.0 MB peak memory
      effectful (local/static)
        shallow:        OK
          3.64 μs ± 349 ns,  16 KB allocated,   4 B  copied, 6.0 MB peak memory
        deep:           OK
          4.00 μs ± 337 ns,  23 KB allocated,  15 B  copied, 6.0 MB peak memory
      effectful (local/dynamic)
        shallow:        OK
          15.5 μs ± 1.3 μs, 134 KB allocated,  40 B  copied, 6.0 MB peak memory
        deep:           OK
          15.9 μs ± 1.4 μs, 141 KB allocated, 122 B  copied, 6.0 MB peak memory
      effectful (local/dynamic/labeled/send)
        shallow:        OK
          15.7 μs ± 1.5 μs, 134 KB allocated,  41 B  copied, 6.0 MB peak memory
        deep:           OK
          16.0 μs ± 1.4 μs, 141 KB allocated, 123 B  copied, 6.0 MB peak memory
      effectful (shared/static)
        shallow:        OK
          15.9 μs ± 1.3 μs, 118 KB allocated,  29 B  copied, 6.0 MB peak memory
        deep:           OK
          16.7 μs ± 1.5 μs, 125 KB allocated,  96 B  copied, 6.0 MB peak memory
      effectful (shared/dynamic)
        shallow:        OK
          27.2 μs ± 2.7 μs, 236 KB allocated,  73 B  copied, 6.0 MB peak memory
        deep:           OK
          27.5 μs ± 1.3 μs, 243 KB allocated, 209 B  copied, 6.0 MB peak memory
      effectful (shared/dynamic/labeled/send)
        shallow:        OK
          27.1 μs ± 2.7 μs, 236 KB allocated,  72 B  copied, 6.0 MB peak memory
        deep:           OK
          26.8 μs ± 1.6 μs, 243 KB allocated, 208 B  copied, 6.0 MB peak memory
      mtl (effectful)
        shallow:        OK
          43.6 μs ± 2.6 μs, 236 KB allocated,  68 B  copied, 6.0 MB peak memory
        deep:           OK
          44.6 μs ± 2.7 μs, 243 KB allocated, 205 B  copied, 6.0 MB peak memory
      mtl (transformers)
        shallow:        OK
          53.5 μs ± 2.9 μs, 477 KB allocated,  90 B  copied, 6.0 MB peak memory
        deep:           OK
          418  μs ±  33 μs, 3.0 MB allocated, 875 B  copied, 6.0 MB peak memory
      fused-effects
        shallow:        OK
          81.6 μs ± 6.5 μs, 719 KB allocated, 123 B  copied, 7.0 MB peak memory
        deep:           OK
          690  μs ±  50 μs, 5.2 MB allocated, 1.3 KB copied, 7.0 MB peak memory
      polysemy
        shallow:        OK
          104  μs ± 6.3 μs, 1.4 MB allocated, 386 B  copied, 8.0 MB peak memory
        deep:           OK
          191  μs ±  17 μs, 3.6 MB allocated, 1.6 KB copied, 8.0 MB peak memory
      theseus
        shallow:        OK
          74.3 μs ± 5.5 μs, 1.1 MB allocated, 203 B  copied, 8.0 MB peak memory
        deep:           OK
          476  μs ±  35 μs, 8.4 MB allocated, 2.2 KB copied, 8.0 MB peak memory

All 24 tests passed (4.27s)

Looks as expected. Similar to polysemy, scales worse with effect stack size though.

Filesize.1000
$ cabal run bench -- -p filesize.1000
All
  filesize
    1000
      reference: OK
        721  μs ±  52 μs, 6.1 MB allocated, 298 KB copied, 7.0 MB peak memory
      effectful
        shallow: OK
          782  μs ±  34 μs, 6.7 MB allocated, 200 KB copied, 7.0 MB peak memory
        deep:    OK
          784  μs ±  45 μs, 6.7 MB allocated, 208 KB copied, 7.0 MB peak memory
      mtl (effectful)
        shallow: OK
          911  μs ±  87 μs, 7.3 MB allocated, 227 KB copied, 7.0 MB peak memory
        deep:    OK
          906  μs ±  50 μs, 7.3 MB allocated, 234 KB copied, 7.0 MB peak memory
      mtl (transformers)
        shallow: OK
          1.03 ms ±  99 μs, 7.6 MB allocated, 376 KB copied, 7.0 MB peak memory
        deep:    OK
          2.24 ms ± 185 μs,  15 MB allocated, 870 KB copied, 8.0 MB peak memory
      fused-effects
        shallow: OK
          1.73 ms ± 108 μs,  13 MB allocated, 722 KB copied, 8.0 MB peak memory
        deep:    OK
          3.21 ms ± 250 μs,  24 MB allocated, 1.1 MB copied, 8.0 MB peak memory
      polysemy
        shallow: OK
          2.11 ms ± 173 μs,  18 MB allocated, 1.2 MB copied, 8.0 MB peak memory
        deep:    OK
          2.34 ms ± 170 μs,  22 MB allocated, 1.3 MB copied, 8.0 MB peak memory
      theseus
        shallow: OK
          710  ms ± 8.3 ms, 6.6 GB allocated, 567 MB copied,  10 MB peak memory
        deep:    OK
          725  ms ±  24 ms, 6.7 GB allocated, 581 MB copied,  10 MB peak memory

All 13 tests passed (6.65s)

Something’s busted here. It’s orders of magnitude slower than other libraries and allocation results are, well, interesting.

7 Likes

I’ve now got a test showing how it handles exceptions from IO. The implementation is pretty much what I described: the liftIO function wraps the operation in a try, uses the Throw SomeException effect to propagate any Lefts, then uses throwIO after Theseus is done running to make it look like the exception passed through Theseus unharmed. Along the way though I discovered some pretty bad footguns related to finalizers that I’ve been trying to remove. It’s working now, but the implementation is scary enough that I think I’m missing something. I really need to try taming the complexity that the control flow constructs introduced.

Is it fair, then, to describe Theseus as a new step in algebraic and higher order effect research, but not as suitable for production use?

Yeah, I think that’s fair. I’d like it to be “production ready” eventually and I’ll continue dogfooding it for my own projects, but I’m not convinced yet that it won’t topple over. Especially since @arybczak found a pretty bad case of it toppling over already. Thanks for adding it to that benchmark by the way! I expect it to be slow, but you’re right that something is busted for it to be that slow. I’ll have to investigate what’s going wrong.

2 Likes