What is a higher-order effect?

tomjaguarpaw · November 19, 2024, 7:46am

Bluefin uses threads to implement connectCoroutines. It seems to work well. What drawbacks were you thinking of?

osa1 · November 19, 2024, 8:55am

delimited continuations

I’m still looking for a good application of those

Some practical applications that I can think of:

Implementing concurrency libraries and tools, like dejafu.
Resumable exceptions.
Coroutines, which are useful for many things. You can make a parser written in the usual style to work incrementally, or make it yield one parse event/token at a time (instead of a list of things in one go). Some state machines can be implemented more conveniently as coroutines.

I think without control flow effects, the effect system becomes a dependency injection framework/library, with some extra features, like local discussed above.

Lysxia · November 19, 2024, 11:37am

Bluefin uses threads to implement connectCoroutines. It seems to work well. What drawbacks were you thinking of?

Rather than implementing specific effects, the problem I’m talking about is to implement algebraic effects (or delimited continuations) in general. The goal is to define a monad Free implementing the interface:

instance Monad (Free f)  -- return, (>>=)
send :: f a -> Free f a
interpret :: (forall x. f x -> (x -> Free g a) -> Free g a) -> Free f a -> Free g a
data Empty a
runFree :: Free Empty a -> a

satisfying various equations about how those functions interact, the most notable one being

interpret h (send f >>= k) = h f (interpret h . k)

There are many known ways to do this: a common starting point is the Free data type. Another way is using a higher-order effect from any one of the fancy effect libraries that support those (I think this should be possible in bluefin as well):

data SendBind f m a where
  SendBind :: f x -> (x -> m a) -> SendBind f m a

-- Simplified variant of https://hackage.haskell.org/package/effectful-core-2.5.0.0/docs/Effectful-Dispatch-Dynamic.html#v:interpret
interpret :: (forall x. h (Eff g) a -> Eff g a) -> Eff h a -> Eff g a
interpret :: (forall x. SendBind f (Eff g) a -> Eff g a) -> Eff (SendBind f) a -> Eff g a -- specialization

However implementing algebraic effects this way doesn’t buy you much compared to the naive Free data type, because continuations are represented as closures (x -> m a) which cannot be eliminated unless you know the handler has a nice shape. Another way to look at this is that although you may define Free f = Eff (SendBind f) to answer the question above, you have to use a different Monad instance than Eff’s to satisfy the required equations; so you don’t benefit from the fact that Eff is just IO under the hood, which is core to the performance of Eff.

The updated problem statement is to implement Free where (>>=) behaves the same as in IO or ReaderT _ IO.

The primitives prompt#/control0# give such a solution. The cost in exchange is that send :: f a -> Free f a, which is really control0#, traverses the stack to copy it and find a matching prompt#. That’s not great.

OCaml palliates this issue using “segmented stacks”: the stack is represented as a linked list so instead of copying “stack segments” we manipulate pointers; and handlers are always at the end of a segmented stack, so you can find the closest handler in constant time without searching frame by frame.

My crazy idea is that GHC already has something similar to segmented stacks: threads. The irony is that OCaml uses segmented stacks to achieve concurrency, and here I’m doing the opposite. Each thread has its own stack and they can pass control to each other by waiting on MVars. With that, it should be possible to implement algebraic effects in IO with reasonable asymptotics.

In the code linked below, client code ((>>=) and send) simply consists of MVar operations. On the other hand, the handler interpret is more complicated, but optimizing the client does not rely on knowing the handler and vice versa.

Gist: Algebraic effects implemented using threads · GitHub

tomjaguarpaw · November 19, 2024, 12:01pm

Oh I see. Are you sure you can implement multi-shot continuations using threads?

Lysxia · November 19, 2024, 12:24pm

No, that’s one of the many catches

sgraf · November 19, 2024, 12:53pm

Using threads sounds like a neat idea, because it means we do not need to copy the stack, in contrast to control0#. Still, forking and creating MVars is not free… I guess we would need to look at benchmarks.
For example, when using the forking strategy to implement effects in concurrency, I imagine that the number of threads spawned will at least double because every send will spawn a new thread, even for effects that are not ForkIO.

Let me try to summarise the different implementation strategies based on effect class, so that we know what cases we need to optimise for:

Users should be able to declare that an effect (rather than just one of its handlers) is supposed to be tail-resumptive (i.e., fun in Koka), then we do not need to bother with control0# in this case and get great perf. I would think that the vast majority of effect use cases is tail-resumptive. Ömer called providing handlers for extensible non-control effects “dependency injection” and I’m inclined to agree.
If users request a control effect, they opt into some overhead, be it through extraneous forkIO and MVars or through use of control0#.
Whether forking or control0# is faster for single-prompt delconts is an open question
People who want multi-prompt delconts will (probably?) need to keep using control0#.

Lysxia · November 21, 2024, 3:38am

Coming back to the topic of higher-order effects, I don’t find local to be a very motivating example. An “operation that takes computation as arguments” seems very fancy at first, but there’s not a lot that can be done with an argument which is a computation: basically you can run the computation, in which case a handler for a higher-order effect boils down to some extra code that you run before and after the given computation. We may also come up with handlers that run the computation more than once, or under some additional handlers, and I conjecture that there are not that many “shapes” for handlers of higher-order effects, and that once we have enumerated them all, we will find that they can be desugared in terms of first-order effects.

In the simplest case where all existing higher-order handler are assumed to be “brackets” around the argument computation, they can be desugared to two first-order effects to call before and after the computation. That is enough to express the examples mentioned above: the standard handler for local (by modifying the state and then restoring it) and the logOnSpan example that inserts logging before and after the argument action.

@tomjaguarpaw By including local in bluefin’s Reader effect you gain expressiveness but lose “reasonability”: with only ask, we immediately knew that all uses of ask bound to the same handler will return the same value; with local we now need to keep track of local on the call stack to determine the value of ask. The point is that "ask only" and “ask+local” are two meaningfully different interfaces. Without local, we had one neat equation:

runReader v (\c -> f (ask c)) = f (pure v)

I’m not sure how strongly I am going to push for an "ask only" interface to be part of bluefin (maybe passing values directly works just as well?), but for the sake of argument, consider this similar situation: why even have “ask+local” (Reader) when “get+put” (State) is more expressive anyway?

One last pet peeve I’d like to talk about is how overloaded and confusing the adjectives “dynamic” and “static” are. There at least three possible meanings in this discussion already.

First, there is the meaning of Effectful.Dispatch.Dynamic/...Static, which is a distinction which could also be made in bluefin as well. To set the scene, effectful and bluefin are two realizations of the “handle pattern”: “handles” are resources created by “handlers” (interpret, runState, etc.), and there are operations that use those “handles” to do stuff.

But what is a handle actually? You could say that a handle is data: it could be a value or reference that you get/set, a file descriptor which lets you do IO, a flag to enable a feature, etc. We might call this the “static handle” model, and that is what Dispatch.Static deals with. Or you could say that a handle is executable code: the send operation executes that code. We might call this the “dynamic handle” model: Dispatch.Dynamic. Instead of “dynamic” vs “static”, one could call this distinction “handles as code” vs “handles as data”.

These are two complementary models: dynamic handles generalize static handles, as closures over the relevant data; and static handles generalize dynamic handles since code is data. I find this distinction to be the least interesting because it is “just” a special case of an ubiquitous code-data duality.
The second “dynamic/static” distinction is the one that separates effectful and bluefin. It is a question of scope, by analogy with variable binding: how to determine what handler handles what operation? For a precise example, consider an expression h (f g1 g2) with a handler h :: Eff (e : es) x -> Eff es x, two operations g1, g2 :: Eff (e : es) x of effect e, and some function f :: Eff (e : es) x -> Eff (e : es) x -> Eff (e : es) x. Does h handle the operations g1 and g2?

With dynamically scoped handlers (effectful), the answer depends on f: it could be that f handles the operations g1 or g2 or both. With statically scoped handlers (bluefin), the expression actually looks more like h (\c -> f (g1 c) (g2 c)), and the use of the variable c guarantees that g1 and g2 are handled by h.

As a digression, note that dynamically scoped handlers can emulate static scoping using polymorphism: if f has type forall e. Eff (e : es) x -> Eff (e : es) x -> Eff (e : es) x then e is abstract to f which prevents f from handling it. However the details can be subtle: in some effect systems (such as one where all effects are defined in terms of what effectful calls “(dynamic or static) dispatch” (see above)), f could handle an abstract operation, send it upwards (because that’s the only thing it could do with e), and insert some actions before and after the sending. Thus, f may detect that some operations were called, which may or may not be a desirable side channel.
The third meaning is when people use “dynamic” to refer to interpose :: e :> es => Handler e es -> Eff es x -> Eff es x, which “dynamically” installs a new handler for e, hiding an older handler. This may seem closely related to the previous point: in the example expression h (\c -> f (g1 c) (g2 c)), surely f can’t interpose a handler to hide h, because it is statically bound to its operations via c. However you can choose to explicitly allow interposition by rewriting that example to thread the handle c through f: h (\c -> f c (\c' -> g1 c') (\c' -> g2 c').

Another way to approach the same idea: if we name effectful’s monad Eff and we name bluefin’s monad B, then we can morally view Eff in terms of B as Eff es x = Hdls es -> B es x, where Hdls is the type of handles for the effect row es. The type of interpose unfolds to e :> es => Handler e es -> (Hdls es -> B es x) -> (Hdls es -> B es x), which makes explicit how handles are threaded through it.

Note that this is not the only way to “dynamically change the handler” in the statically scoped world. Another technique is to make handles Hdls mutable, which enables a variant of interpose that works even when the handle is passed directly from a handler to its operations without a middle-man. Going back to the running example from earlier, you could write h (\c -> f c (g1 c) (g2 c)) to pass a mutable handle c to f and let it “act at a distance” on g1 and g2.

tomjaguarpaw · November 21, 2024, 8:30am

Lysxia:

@tomjaguarpaw By including local in bluefin’s Reader effect you gain expressiveness but lose “reasonability”: with only ask, we immediately knew that all uses of ask bound to the same handler will return the same value; with local we now need to keep track of local on the call stack to determine the value of ask. The point is that “ask only” and “ask+local” are two meaningfully different interfaces. Without local, we had one neat equation:
runReader v (\c -> f (ask c)) = f (pure v)
I’m not sure how strongly I am going to push for an “ask only” interface to be part of bluefin (maybe passing values directly works just as well?), but for the sake of argument, consider this similar situation: why even have “ask+local” (Reader) when “get+put” (State) is more expressive anyway?

Right, but I’m beginning to think the ask+local Reader interface is the Reader interface. I previously (naively) assumed that the notion of Reader was just an abstraction of (->), i.e. threading an immutable value around. After this discussion I begin to believe that local modifications of an otherwise immutable value are an essential component of that API.

It does satisfy a nice property though: these two operations are always equivalent:

do
  r1 <- ask re
  r2 <- ask re
  pure (r1, r2)

do
  r <- ask re
  pure (r, r)

(and maybe an even more powerful equivalence can be stated).

tomjaguarpaw · November 21, 2024, 1:37pm

The short answer is, yes, Bluefin can do that! See example below.

This is very interesting to me. First I thought that reader state should be immutable. Then I realized that the point of local is to mutate the reader state in a well-scoped way. Now I realize that local is not immutable either, and interpose can be used to modify it in a well-scoped way. My next thought is: why stop at local? Why not allow interpose of local itself to be modified in a well-scoped way? And then … where do we stop? Can’t we do that indefinitely, at each step allowing ourselves to modify the thing that modifies the level below?

Unfortunately it was too vague for me to be able to replicate it in Bluefin. I would be grateful if you could write a small working sample in effectful that demonstrates the nature of interpose modifying local in a Reader effect, so that I can confirm that what I have implemented in Bluefin is a match for it.

Requires unpublished Bluefin: bluefin/bluefin-internal/src/Bluefin/Internal.hs at 5bf5efad6620f51ae98d417f6a8f9c925abfba5b · tomjaguarpaw/bluefin · GitHub

-- ghci> example
-- inside first local
-- local will add trace local1
-- Entering modified local: local1
-- inside second local
-- local will add trace local2
-- Entering modified local: local2
-- Entering modified local: local1
-- inside third local
example :: IO ()
example = runEff $ \io ->
  forEach
    ( \y -> do
        runDynamicReader () $ \re -> do
          dynamicLocal id re $ do
            yield y "inside first local"
            withTracingLocal y re "local1" $ do
              yield y "local will add trace local1"
              dynamicLocal id re $ do
                yield y "inside second local"
                withTracingLocal y re "local2" $ do
                  yield y "local will add trace local2"
                  dynamicLocal id re $ do
                    yield y "inside third local"
    )
    (effIO io . putStrLn)

tomjaguarpaw · November 21, 2024, 2:13pm

I wondered this, and the answer is because it’s less expressive. It’s not such a big deal when the reader state is non-effectful, but when it contains effects it becomes essential! For example, in a Bluefin branch I have a handle called H, which is essentially a “reader of a handle”. It supports the following operations. In particular, localH allows us to set the reader state locally to a value that is only permissible within the block (contains effects es), not outside (only effects e)!

localH :: (e :> es, Handle h) => H h e -> h es -> Eff es r -> Eff es r

askH :: (e :> es, Handle h) => H h e -> Eff es (h es)

runH ::
  (e1 :> es, Handle h) =>
  h e1 ->
  (forall e. H h e -> Eff (e :& es) r) ->
  Eff es r

ymdfield · November 21, 2024, 2:42pm

That’s a sharp insight. I believe this idea—that higher-order effects can eventually be desugared into first-order effects—is precisely the concept presented in the Hefty Algebra paper. In the paper, they refer to the desugaring operation that lowers higher-order effects into first-order ones as elaboration, which I feel perfectly captures the nuance. When I think about higher-order effects, I naturally have this mental model as well.

I agree. I feel that we should assign specific terms to each of these three (or perhaps even more?) dynamic/static distinctions.

(I reposted this because I mistakenly specified the reply destination. Sorry.)

jaror · November 21, 2024, 3:36pm

Note that the elaboration in the Hefty Algebras paper refers to rewriting the higher order operations to uses of handlers (usually), not to inserting first-order operations before and after the higher order block. So, I think you and @Lysxia are not exactly on the same page.

ymdfield · November 21, 2024, 4:14pm

Ah, indeed. For example, when considering the elaboration of the catch higher-order effect, we can see this difference in that, rather than desugaring catch into a collection of throws, it’s desugared into a handler for throw. I was too hasty.