[ANN] A series of articles on Heftia: The Next Generation of Haskell Effects Management

sgraf · May 22, 2025, 9:42am

On the other hand, if we know statically that only tail-resumptive handlers will ever be used, we can rewrite the code as follows:

Indeed, this is exactly what is impossible if you compile prog only once, before compiling runProg.

However, in the end, despite the impact on binary size, I think we can simply generate all three definitions and let the handling site’s code generator choose which one to use.

Yes, for this example, 3-specialisations-up-front seems like a viable route to take. However, if you have multiple effects, you are susceptible to exponential code bloat, because for n effects you need 3^n specialisations to cover all handler combinations up-front.

ymdfield · May 22, 2025, 9:54am

That applies to the third case, but shouldn’t the first two not blow up? Since allowing non-tail-resumptive handlers for only a subset of effects is impossible (for the purpose of converting Freer into ReaderT IO), the only possible combinations are whether all effects are tail-resumptive or not.

tomjaguarpaw · May 26, 2025, 9:38am

I realised that I can implement the Bluefin equivalent of ListT/LogicT as

newtype Logic es a =
  MkLogic (forall e. Stream a e -> Eff (e :& es) ())

so I’m very close to declaring that Bluefin supports logic programming/backtracking. I’m still searching for an essential use case of LogicT.

(This is what pipes does to define ListT: Pipes)

ymdfield · May 27, 2025, 12:39pm

At first, I mistakenly thought that the Logic type was a kind of compound custom effect (see the exchange on X: https://x.com/ymdfield/status/1927319998529204478), but I see now, it’s about introducing a new Logic monad on top of Eff. I think that’s a nice approach.

tomjaguarpaw · May 29, 2025, 9:41am

What I wrote works, but it’s not actually much different from LogicT (in fact I think you can get all its behaviour from LogicT (Eff es) so it seems to be redundant). It doesn’t allow you to run Bluefin handlers within a LogicT block, so you are stuck with something like StateT s (LogicT (Eff es)). You can’t use Bluefin to replace that StateT.

The reason is that the state should “reset” at every branch point. There’s no direct way of achieving that if your state is a mutable reference. Here is an example of the desirable behaviour:

import Control.Applicative ((<|>))
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State (StateT, modify, runStateT)

testProgram :: StateT Int [] ()
testProgram = do
  () <- lift (pure () <|> pure ())
  modify (+ 1)
  () <- lift (pure () <|> pure ())
  modify (+ 1)

-- ghci> example
-- [2,2,2,2]
example :: [Int]
example = do
  (_, s) <- flip runStateT 0 $ do
    testProgram
  pure s

(This is also the first time I understand what the point of LogicT is: it’s not how it interacts with other effects per se, but rather about how it interacts with other handlers. I don’t believe there’s any way of getting the desired behaviour in an IO-based effect system (unless that system is also using RTS delimited continuations itself, which is another can of worms). Thanks to @ymdfield for explaining this to me.)

ocharles · May 29, 2025, 9:57am

A big part of LogicT is msplit, which is the foundation of fair enumeration. Does your bluefin version still have that?

tomjaguarpaw · May 29, 2025, 10:04am

Not msplit literally, but consumeStream should give the needed functionality (observe one result and delay observing the rest until later). If it doesn’t do so directly then at least some gnarly code with threads and MVars (as in how consumeStream is implemented) could do the job.

sgraf · May 29, 2025, 12:35pm

Perhaps you want something like Lean’s Backtrackable? See Varieties of Monads

tomjaguarpaw · May 29, 2025, 1:38pm

Yes, that’s the kind of behaviour needed, but implemented in Bluefin. HandleReader can be used to override handles locally, so I think some use of that might work, but there’s still a lot of plumbing involved.

cdfa · May 30, 2025, 5:00pm

Thanks all for this fascinating discussion! I’d like to add some notes based on my experience working with heftia.

I’d always stayed away from effect systems because of the rumours I heard about strange behaviour in edge cases and when effects interact. But then at work a project came up where it would be very useful to define my own effects, and I became convinced that heftia’s “algebraic nature” would provide the least probability of such strange behaviour.

With “algebraic nature”, I mean that heftia’s effects and core datatypes are pure Haskell data types and there is no reliance on IO. A program simply constructs and deconstructs these datatypes. Compare this with effectful’s unlifting strategies. Having a choice of something that is not quite trivial to grasp possibly lead to a runtime error is a big drawback for me.

Using heftia to do the thing I wanted to do turned out to be a little more difficult than I expected because of the restriction that higher order effects need to be interpreted before first order effects (in general), but in the end I was able to work something out that’s slightly more flexible. Although I didn’t need to make use of heftia’s internals for this, the fact that it has an open core makes it much more inviting to try stuff like this.

Performance was not a big concern in this project, but this thread does have me worrying about it more. I do think heftia could be a little more honest about it (at least in the README), and I’d love to see:

More statements that compare heftia’s performance to a pure IO reference point, instead of other libraries. In the end, this is what matters.
A memory benchmark.

I’m not a big fan of adding a GHC plugin to rewrite the program to eliminate the effect datatypes. If there is a bug in the rewriting, it’s going to be very hard to figure out. I know in the end my program gets rewritten a lot by GHC as well, but GHC does have much more elaborate quality control.
I’d be fine with adding some pragma or another though .

Thank you for all your work!

tomjaguarpaw · May 30, 2025, 5:41pm

Do you apply the same objection to ST?

cdfa · May 30, 2025, 7:08pm

To some extent, yes. While ST does not use IO and thus provides fewer subtle ways to shoot yourself in the foot, the roots of operations like readSTRef are far from pure.

My intuition is that stuff built on ST is therefore less likely to be “correct by construction”.

tomjaguarpaw · May 30, 2025, 7:15pm

Technically not, but look at the definitions. Are you really going to say they’re different in an essential way?

newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))

newtype ST s a = ST (STRep s a)
type STRep s a = State# s -> (# State# s, a #)

Sure, but Bluefin also sequesters IO and can guarantee you can’t shoot yourself in the foot that way (similarly effectful).

It would be good if there was some way of turning that intuition formal.

cdfa · May 30, 2025, 7:35pm

Sure, but Bluefin also sequesters IO and can guarantee you can’t shoot yourself in the foot that way (similarly effectful).

From #29, it seems I can (shoot, at least in some way)

It would be good if there was some way of turning that intuition formal.

Sadly, I don’t know how. I might ask a friend who studied math.

tomjaguarpaw · May 30, 2025, 9:12pm

I mean, sure, Bluefin’s API contains an unsafe function. It could also simply not contain it, but I’m keeping it around until I work out the best course of action. It’s not some essential component of the library.

cdfa · May 31, 2025, 7:22am

I don’t think exposing unsafe functions in necessarily an issue even, but it seems that in this case it wasn’t clear that it’s unsafe at the start, and it still seems unclear when usage of the function is unsafe or not.
Or can you shed more light on this?

tomjaguarpaw · May 31, 2025, 7:48am

That’s a fair point. The problem is that it can allow handles to escape their scope. I don’t know whether that allows you to write unsafeCoerce, but I can’t rule it out. Given that withEffToIO_ requires IOE you can’t use runPureEff, so I don’t see any way that Bluefin’s use of unsafePerformIO could take place on such an escaped handle. Therefore I think you can only write code as bad as if you’d written it by hand in IO, i.e. not that bad.

(It would be different if Bluefin had handlers that give guarantees beyond what IO can give, for example allocating memory when the scope is introduced and freeing it when the scope closes.)

Still, I don’t even know how to formalise “handles escape scope” nor the exact safety property you get if they don’t, so I don’t know how to formalise conditions for safe use of withEffToIO_ (and equivalents). I guess the ST paper explains it.

But my original point was that withEffToIO_ isn’t some widely-used part of the Bluefin API. I’ve never written client code that uses it (and I don’t know if anyone else either). It’s certainly required for implementing certain sorts of IO-based handlers! I’m just saying it’s likely that most Bluefin users will never touch it.

cdfa · May 31, 2025, 10:18am

Fair enough, the issue seems specific to the combination of Bluefin.System.IO.withFile withEffToIO_, and if withEffToIO_ were removed from the public API of Bluefin, that would resolve the issue.

But that would prevent users from implementing “certain sorts of IO-based handlers” then right? Or if it becomes part of an internal API, users still need to know the correct way to use it to implement “certain sorts of IO-based handlers”.

tomjaguarpaw · May 31, 2025, 5:00pm

Yes, exactly so! Though I would add there are probably very few such “sorts of IO-based handlers” that actually need writing (bracket and catch probably cover 99% of real use cases) and even if you get it wrong it’s probably only as bad as having written the equivalent handler directly in IO wrong, which is to say a runtime crash, not a segfault or unsafeCoerce.

EDIT: Actually, it’s a bit worse than that. If anyone adds an effect which relies on the Bluefin to enforce well-scoping then anyone else can break that by using withEffToIO_ (or friends).

ymdfield · November 3, 2025, 2:58am

I was unsure about many things, so over the past few months I have been looking into performance whenever I had time. Writing it all out again now, it seems the performance in free monad based effect system libraries with respect to time complexity can be decomposed into the following three factors. (I have not yet investigated space complexity. What would be the best way to approach it?)

The O(m^2) overhead with respect to the number of instructions m in the program, as pointed out in the Reflect w/o Remorse paper

This is what I previously advertised in heftia as the speedup from freer-simple’s FTCQueue implementation. It is exactly as the paper says.

The O(n) overhead with respect to the depth of the effect stack, that is the number of effects in the type level list, due to linear search for handlers (https://x.com/ymdfield/status/1946881981981261831, Are complaints about Free Monad performance pointless, and no different to a corresponding Monad construction? - #17 by ymdfield )

This was something heftia did not solve. As the effectful family does, this can be solved by adopting an evidence-passing approach that treats a vector of handlers as the environment of ReaderT. Regarding this, the free monad itself is not the cause of the performance degradation. It is a matter of how to maintain the handler environment. We need to change the way instructions are injected into an Open Union and placed as nodes in the free monad tree. I plan to try some combinations of approaches and will report results when I have them.

The overhead of delimited continuations

This differs depending on whether you use GHC primops or a free monad. This was hinted at in earlier eff benchmarks ( heftia/benchmark/bench-result/coroutine-shallow.svg at v0.7.0.0 · sayo-hs/heftia · GitHub ), but when using primops the coroutine effect becomes extremely slow (O(m^2)). Why this difference arises is a matter of the primops implementation and is beyond my current understanding, but at least this problem does not occur with a free monad based approach. However, outside cases like this, primops should be faster than free by a constant factor only, although I have not been able to test it yet. At this point, I think the best approach is to provide both implementations and let users choose via a compile flag depending on whether they use coroutines.

Overall, as before, I do not think the free monad is the cause of slowness. I cannot state it definitively, but I do think there is a slightly misleading aspect or a subtle point there. At the very least, it may be true for a naive implementation, but that is ultimately a question of how you use it as a building block to assemble an effect system. Well, it might turn out not to be the case. I will report concrete benchmark results as they become available.

Topic		Replies	Views
[ANN] heftia-effects: higher-order algebraic effects done right Announcements	11	1696	October 13, 2024
Why use an effect system? Learn	124	4820	March 5, 2025
Why are there so many libraries for algebraic effects? Learn	57	2305	April 20, 2025
Bluefin compared to effectful [video] Show and Tell	44	4740	November 19, 2025
The issues with effect systems Learn	63	6643	February 18, 2023

[ANN] A series of articles on Heftia: The Next Generation of Haskell Effects Management

Related topics