The issues with effect systems

They’re merged and will be released in the upcoming GHC 9.6. So, we won’t have to wait that long any more.

The delimited continuation primops are about 1.5-2x as fast (compared to speff, which seems to be the next fastest) in micro-benchmarks that mostly just measure the overhead of the effect system itself. I’d still call that significant. Of course, in many applications the effect system itself is not the bottleneck.

They use the same delimited continuations as eff, but encoded as a “normal” Haskell data type. The implementation in that new paper is not that different from the one described in this 2005 paper, which is implemented in the CC-delcont package on Hackage. But I do agree that delimited continuations are more difficult to grok than the free monad.
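
To give a flavor of the “delimited continuations as an ordinary Haskell value” idea without reproducing the paper: the ContT transformer from the transformers package already offers resetT/shiftT. This is only a minimal sketch of the concept, not the paper’s or CC-delcont’s encoding:

module DelContFlavor where

import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Cont (evalContT, resetT, shiftT)

-- resetT delimits the continuation; shiftT captures "the rest of the resetT
-- block" as a function k that the handler may call zero or more times.
example :: IO Int
example = evalContT $ resetT $ do
  x <- shiftT $ \k -> do
    a <- lift (k 1)   -- resume the rest of the block with x = 1 (yields 10)
    b <- lift (k 2)   -- and again with x = 2 (yields 20)
    pure (a + b)      -- the result of the whole resetT block: 30
  pure (x * 10)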

4 Likes

Interesting! I did not realize that speff is currently the fastest effect system (well, until 9.6 releases… so, a few weeks?). It’s very interesting how that works… I’m really interested in bringing that performance to free-monad implementations like polysemy, though! It seems the performance problem comes from the bind operator, which allocates a tree of closures that eventually has to be walked. Could the bind operator be optimized somehow, using extra information, to allocate a flat sequence instead… I guess? I really have no clue what I’m talking about, but I think free-monad implementations require less effort overall to implement, and I wonder if there’s a way to clear up the issue with bind.

effectful basically is this, in a well-typed way. It seems like effectful is strictly better than ReaderT ... IO in every dimension, except perhaps that the types are more complex (but they should be - tracking effects in types is the point!)

1 Like

speff seems to be better than effectful too, in that it seems to be way faster? Or does effectful have some sort of edge over speff?

Is it? Do you have some benchmarks you could point to? I would be surprised if it could be, since effectful is just doing IO, nothing fancy. (I suppose looking stuff up in the environment might be fancy, but speff has to do that too I think.)

1 Like

I think they’re about evenly matched on performance, comparing these numbers on speff with these numbers on effectful using the shallow fused effects as reference.

1 Like

It seems like effectful is strictly better than ReaderT ... IO in every dimension

Besides being faster, what are other benefits of effectful over ReaderT?

I’m not sure it is faster, but it’s typed. You can see what effects any block of code has. That’s most of the point of Haskell, in my view!

2 Likes

You can see what effects any block of code has.

That’s indeed useful, but you can have it in ReaderT-like approaches too. I wondered about that in this other thread.

Like, imagine that you have a record like

data Repository m = Repository
  { findById :: ResourceId -> m Resource,
    save :: Resource -> m ()
  } 

And a constructor that uses a Has-like class to find its dependencies in a dependency injection environment:

makeRepository :: (Has Logger m deps, Has SomeOtherDep m deps) 
               => deps -> Repository m
makeRepository = undefined

That seems roughly analogous to an effectful interpreter for some Repository effect that delegates to the Logger and SomeOtherDep effects.

Maybe it’s a question of taste, or maybe there are other benefits to the effectful way.

Oh, perhaps I should have been clearer. Rather than “see what effects a block of code has” I should have said “see what effects a block of code does not have”. For example, the type of foo guarantees it has no effects.

foo :: Eff es (Either String Int)
foo = runErrorNoCallStack (if True then throwError "Hello" else pure 5)

There’s no way to do the above in anything that uses IO as a base type. That is to say, effectful (and other effect systems) allow you to remove effects. ReaderT ... IO doesn’t.

2 Likes

However, you could instead use an abstract data type based on a newtype declaration:

module Follow (
  Follow, follow,
  openLog, addLog, saveLog, ...
) where
{-- assorted imports -}

newtype Follow a = Follow (IO a)
follow :: Follow a -> IO ()
openLog :: FilePath -> Follow LogFile
addLog :: LogFile -> Message -> Follow ()
saveLog :: LogFile -> Follow () 
                    ⋮

As always, it will depend on what’s required - sometimes an ADT will suffice; other situations may require the granularity provided by tracking individual effects. Just remember that there can be more than one way to achieve a result.

Yes, that’s exactly what effectful does! (And it allows mixing effects by keeping track of them in a type-level parameter.)

Another potential benefit is that effect systems provide flexibility when testing effectful code.

…provided the number of independent effects is kept relatively small, because the number of combinations of effects will be exponential: n effects → 2^n potential combinations to test with.

But since effect systems these days are trying hard to reduce the performance issues of big effect stacks, doesn’t that mean that these combinations should be fine? Unless you mean that it makes testing difficult, but I believe you just change the handler of the effect to another handler that does what you need for testing, right?

[…] I believe you just change the handler of the effect to another handler that does what you need for testing, right?

That only helps if [the use of] each effect is tested in isolation. If effects are going to be combined, then those combinations will also have to be tested. Now imagine [potentially] needing a separate test for each combination of effects…


[…] since effect systems these days are trying hard to reduce the performance issues of big effect stacks, doesn’t that mean that these combinations should be fine?

If you have 32 independent effects, that’s 2^32 potential combinations of those effects - over 4 billion of them to test.

I think testing effect implementations themselves and testing effectful functions are different things. If you aim to test whether an effect you developed is well behaved in combination with other effects, then what you described is a potential problem (and maybe a reason why we should prove the properties of the effect system formally instead of testing them).

However, for effectful functions, effects enable a form of mock testing - we swap out the implementations of certain effects that the function requires for mock ones, like an in-memory database and filesystem for the respective DB and FS effects. This has nothing to do with testing an exponential number of effect combinations.
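
To make that concrete, here is a sketch of the handler-swapping idea using effectful’s dynamic dispatch (assuming effectful-2.x; the Logger effect and both handlers below are made up for illustration, while Eff, interpret, send, runEff and runPureEff are the library’s API):

{-# LANGUAGE DataKinds, FlexibleContexts, GADTs, KindSignatures, LambdaCase, TypeFamilies, TypeOperators #-}
module MockLoggerSketch where

import Control.Monad.IO.Class (liftIO)
import Effectful
import Effectful.Dispatch.Dynamic (interpret, send)

-- A made-up effect with a single operation.
data Logger :: Effect where
  LogMsg :: String -> Logger m ()

type instance DispatchOf Logger = Dynamic

logMsg :: Logger :> es => String -> Eff es ()
logMsg = send . LogMsg

-- The computation under test mentions only the effect, not its implementation.
greet :: Logger :> es => String -> Eff es Int
greet name = do
  logMsg ("hello, " ++ name)
  pure (length name)

-- "Real" handler: write log messages to stdout.
runLoggerIO :: IOE :> es => Eff (Logger : es) a -> Eff es a
runLoggerIO = interpret $ \_ -> \case
  LogMsg msg -> liftIO (putStrLn msg)

-- Mock handler for tests: discard messages, keeping the computation pure.
runLoggerSilent :: Eff (Logger : es) a -> Eff es a
runLoggerSilent = interpret $ \_ -> \case
  LogMsg _ -> pure ()

main :: IO ()
main = do
  n <- runEff (runLoggerIO (greet "world"))                  -- logs for real
  print (n, runPureEff (runLoggerSilent (greet "world")))    -- pure test run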

I’ll try to sum up the Haskell effect system lore that I have participated in from a reasonably early point, i.e. from circa 2020 up to now. Since you can’t watch Alexis’ videos, this hopefully serves as a rudimentary substitute as well.

There had been free(r)-monad-based implementations for quite a few years as of 2020; the most popular (read: most usable) ones are freer-simple and polysemy. They are slow because they essentially construct a program tree at runtime, which the handlers then traverse to interpret the program. freer-simple is actually often respectably fast though, but this brings us to another topic of our story: higher-order effects.
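
A minimal sketch of that “program tree” representation (not any particular library’s code) makes the cost model concrete: every bind allocates another continuation closure, and a handler later walks the resulting structure at runtime.

{-# LANGUAGE GADTs, RankNTypes #-}
module FreerSketch where

-- A freer-style monad: a program is either a result, or an operation from the
-- effect signature f paired with a continuation awaiting its result.
data Freer f a where
  Pure   :: a -> Freer f a
  Impure :: f x -> (x -> Freer f a) -> Freer f a

instance Functor (Freer f) where
  fmap f (Pure a)      = Pure (f a)
  fmap f (Impure op k) = Impure op (fmap f . k)

instance Applicative (Freer f) where
  pure = Pure
  mf <*> mx = mf >>= \f -> fmap f mx

instance Monad (Freer f) where
  Pure a      >>= k' = k' a
  -- The costly case: binds nest, building up a chain of closures that is only
  -- taken apart when a handler interprets the program.
  Impure op k >>= k' = Impure op (\x -> k x >>= k')

-- A handler is an interpreter that traverses the structure at runtime.
runFreer :: Monad m => (forall x. f x -> m x) -> Freer f a -> m a
runFreer _ (Pure a)      = pure a
runFreer h (Impure op k) = h op >>= runFreer h . k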

Higher-order effects mean you can create effects with operations that take effectful computations as arguments. That probably sounds confusing; essentially, it allows operations like WithFile :: FilePath -> (Handle -> Eff es a) -> Eff es a. The second argument is an effectful computation because it returns an Eff. If your effect library supports higher-order effects, you can write different handlers for this operation: for example, one reading files from a web server and another reading from your local hard disk. Many people think this is a good thing to have; freer-simple doesn’t have it, but polysemy does (in an awkward way, but it does the job most of the time).
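
For reference, this is the general GADT shape such an operation takes in libraries like polysemy and effectful (only a sketch of the signature; the FileAccess name is made up, and it is the handler that decides where the Handle actually comes from):

{-# LANGUAGE GADTs #-}
module HigherOrderSketch where

import System.IO (Handle)

-- A higher-order effect: WithFile's second argument is itself an effectful
-- computation (Handle -> m a), which the handler runs in a scope of its
-- choosing - e.g. against a local file or a remote server.
data FileAccess m a where
  WithFile :: FilePath -> (Handle -> m a) -> FileAccess m a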

Then there is fused-effects, which everyone believes is fast because it is supposed to “fuse away” intermediate structures… well, it has the same problem as mtl: if the code doesn’t specialize, the compiler has no way to fuse anything away! The only way to make your code specialize reliably is to add {-# SPECIALIZE #-} pragmas all over your functions, which is far from ideal: not only is it tedious to write, it also slows down compilation.
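
For illustration, this is what that chore looks like for a plain mtl-style function (bump is a made-up example; without the pragma, or aggressive inlining, cross-module callers go through the MonadState dictionary instead of a specialized version):

{-# LANGUAGE FlexibleContexts #-}
module SpecializeSketch where

import Control.Monad.State (MonadState, State, get, put)

-- Polymorphic in the monad, so by default it is compiled once against a class
-- dictionary rather than any concrete monad.
bump :: MonadState Int m => m Int
bump = do
  n <- get
  put (n + 1)
  pure n

-- Every such function needs a pragma like this to get the fast, concrete
-- version reliably - tedious to write, and it slows compilation down.
{-# SPECIALIZE bump :: State Int Int #-}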

So the situation was not exactly good. Alexis King and Ningning Xie almost simultaneously had an idea circa 2020 (I believe; I don’t know whether they had communicated about it beforehand), which was to use evidence passing and delimited continuations to build an effect system in Haskell that would be both fast and expressive; specifically, in Alexis’ vision, it would support fast higher-order effects.

What is evidence passing? It basically means passing the effect handlers around inside the Eff monad, so instead of handlers traversing a big program tree as in free monads, we call the handlers directly in place. This is a much more performant approach, and it’s pretty shocking that nobody thought of it before. Nowadays, eff, eveff, mpeff, speff, cleff, and effectful all use evidence passing.
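
Here is a deliberately simplified, single-effect sketch of that idea (nothing here is any library’s actual internals): the monad just carries its handlers - the “evidence” - in an environment, and performing an operation means looking the handler up and calling it on the spot.

module EvidenceSketch where

-- The evidence: one handler per effect, stored in an environment.
newtype Handlers = Handlers { logHandler :: String -> IO () }

-- The monad is merely a function from that environment to IO.
newtype Eff a = Eff { runEff :: Handlers -> IO a }

instance Functor Eff where
  fmap f (Eff m) = Eff (fmap f . m)

instance Applicative Eff where
  pure x = Eff (\_ -> pure x)
  Eff mf <*> Eff mx = Eff (\env -> mf env <*> mx env)

instance Monad Eff where
  Eff m >>= k = Eff (\env -> m env >>= \a -> runEff (k a) env)

-- Performing an operation calls its handler directly; no program tree is
-- built or traversed.
logMsg :: String -> Eff ()
logMsg msg = Eff (\env -> logHandler env msg)

-- "Handling" the effect is just choosing which function goes into the
-- environment before running.
runWithStdoutLogger :: Eff a -> IO a
runWithStdoutLogger m = runEff m (Handlers putStrLn)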

Delimited continuations are actually nothing new, and have been used to build other languages with effects, like Koka, since at least 2015. Probably nobody thought of doing this in Haskell before because Haskell didn’t support them natively, and people assumed an encoding wouldn’t be efficient. Ningning came up with a monad that performs delimited continuations fairly fast, which is used in eveff, mpeff and speff. Alexis, on the other hand, decided to just add native delimited continuations to GHC and went down that path with eff.
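
For the curious, this is roughly what thin IO-level wrappers over those new primops look like (a sketch assuming GHC ≥ 9.6, where newPromptTag#, prompt# and control0# are exposed via GHC.Exts; the wrapper names here are mine, not an official API):

{-# LANGUAGE MagicHash, UnboxedTuples #-}
module PrimopSketch where

import GHC.Exts (PromptTag#, control0#, newPromptTag#, prompt#)
import GHC.IO (IO (..))

-- A lifted wrapper around the primitive prompt tag.
data PromptTag a = PromptTag (PromptTag# a)

newPromptTag :: IO (PromptTag a)
newPromptTag = IO $ \s -> case newPromptTag# s of
  (# s', tag #) -> (# s', PromptTag tag #)

-- prompt delimits the continuation captured by control0 for the same tag.
prompt :: PromptTag a -> IO a -> IO a
prompt (PromptTag tag) (IO m) = IO (prompt# tag m)

-- control0 captures "the rest of the computation up to the matching prompt"
-- and hands it to f as an ordinary IO function.
control0 :: PromptTag a -> ((IO b -> IO a) -> IO a) -> IO b
control0 (PromptTag tag) f =
  IO (control0# tag (\k -> case f (\(IO m) -> IO (k m)) of IO run -> run))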

OK, then what do cleff and effectful use instead of delimited continuations? The answer is: nothing. They made different expressiveness tradeoffs compared to the other libraries mentioned - they do not support nondeterminism, so that they can support MonadUnliftIO, which in turn eliminates the need for delimited continuations. Why can’t we support both nondeterminism and MonadUnliftIO? It is perhaps less well known that many IO functions accessible via MonadUnliftIO, mainly those built on fork and bracket, only work when the control flow is deterministic. That was never a problem before, because Haskell’s control flow was deterministic - but with continuation-based nondeterminism it no longer is! Future effect system users will probably face a choice between nondeterminism-capable libraries and MonadUnliftIO-capable libraries, and they will need to choose based on their specific needs.


Hope that has satisfied your curiosity. As for your questions on effect system implementations in Haskell:

In reality, implementations could be way faster than they are right now (eff, polysemy, eveff, speff, mpeff)

As you can see, evidence-passing based implementations are already very fast, and increasing user adoption, instead of squeezing out more performance, is our primary goal now.

Eff uses delimited continuations but is waiting for some primops to be merged to gain speedups

They’re merged and will ship in GHC 9.6!

eveff/speff/mpeff seem to use a new paper to implement effects that seems really weird

One thing I keep saying is that users of effect systems should not need to know about the internals. If an effect system forces users to do so, it has failed. But on the flip side, “implementation technique sounds weird” should not scare you away from using a library - you only need to interact with the API after all! However, I must warn you that these 3 libraries are all proofs of concept and should not be used in production.

In some cases, they’re more complex than required.

Polysemy’s API is… not ideal. But as you mentioned, newer libraries’ APIs are getting more intuitive and easier to use, so you can try those out instead. We can’t control what others think, really; and the notoriety of the last wave of effect libraries can’t simply be erased. But I’ll keep encouraging the Haskellers I meet to give the new generation of effect libraries a chance.

18 Likes

…so if the action being tested uses n effects, that would mean 2^n potential combinations of mock and regular implementations are required. The main difference would be that the average action would only use a few of the available effects, so testing is more plausible.

Yes, but what you’re describing is still testing the effect implementations, not the effectful computations. I believe both the original comment by @tcard and mine are referring to the advantage of effect systems for testing effectful computations.

About testing effect implementations, I really think we should prove the properties formally instead. Or maybe I’m wrong and we actually should test them, in which case what you described is an unsolved problem.

1 Like