Composing or piping multiple Folds

Control.Foldl packs a lot of interesting folds and folds transformers such as premap and purely but nowhere was I able to find a ‘folds transformer’, i.e. a function of type

transfold :: Fold a b -> Fold b c -> Fold a c and that would allow me to extract and compose the step functions of a plurality of folds, making the following code type check:

fold (transfold increment1 double) [1,2,3] // purely hypothetically: expecting: [3,5,7] assuming a 'double' and an 'increment1' fold.

Because there is no such utility one is stuck writing ‘parallel’ folds with applicatives as in

fold ( (/) <$> length <*> sum ) [1,2,3] // assuming a folder 'length'

where the folds don’t share results, or composing not folds per se but plain functions injected into a fold’s step function as in:

fold (premap (*2) increment1) [1,2,3] // [3,5,7] assuming an 'increment1' fold

which is fine but wastes the potential of preexisting folds which by the way could compose with each other.

In a nutshell, then, it seems that the whole library is premised on the idea that folds in the scope of a single call of a fold function can never pipe their results. I am not sure if this premise is fine if you want to make the most of already usable folds.

Is there a way around this shortcoming?

You should remember that a Fold a b takes a “stream” of as and produces a single value of type b, so the transfold function should have type Fold a [b] -> Fold b c -> Fold a c. I think these kinds of functions are usually called stream transformers. They are present in streaming libraries like pipes, conduit, streaming and streamly (streamly also uses the foldl library).

Streamly has the following example in its README which shows how you can pipe a stream through multiple transformers. Composition is simply done with function application (x & f & g = g (f x)).

import Streamly
import qualified Streamly.Prelude as S
import Data.Function ((&))

main = S.drain $
       S.repeatM getLine
     & fmap read
     & S.filter even
     & S.takeWhile (<= 9)
     & fmap (\x -> x * x)
     & S.mapM print
2 Likes

Hello, thanks for your reply, you might be correct about the use case of Folds – and the concept of their eagerly consuming streams of data. But to turn my initial question into a remark, I would say it is surprising that if you want to express a filter as a Fold, you cannot combine it with any other Fold inside a call to the fold function as per this library (Control.Fold). You have to either prefilter and filter outside of the call. This strikes me a very akward, given that filters are paradigmatic Folds and that all Folds have step functions matching in kind: x -> a -> x.

The main difference between ‘just folds’ and ‘list streaming’ is that those streaming functions also produce a list, whereas the folds only consumes a list.

So something like Fold b c -> Fold a b -> Fold a c isn’t possible, because the Fold a b needs to produce a list of b to pass to Fold b c, whereas it just produces a single b. Fold just isn’t designed for this, unfortunately.

As @dramforeversaid before, Folds collapse a list into a single value whereas a stream produces another stream of values.

You could cleverly use Control.Scanl to compose Folds and Scans to create what you need. This would be an example, taken from the double + increment problem that you posted above:

let doubleAndIncrement = Foldl.premap ((+1) . (*2))
let identityFold = Foldl.Fold (\x a -> a) 0 id
Scanl.scan (Scanl.postscan (doubleAndIncrement identityFold)) [1,2,3] // [3, 5, 7]

Which I know it is not what you wanted to do.

“foldl” does have a function whose signature resembles composition, hidden behind the Semigroupoid instance:

o :: Fold j k -> Fold i j -> Fold i k

What it really does, however, seems a bit non-obvious:

instance Semigroupoid Fold where
    o (Fold step1 begin1 done1) (Fold step2 begin2 done2) = Fold
        step
        (Pair begin1 begin2)
        (\(Pair x _) -> done1 x)
      where
        step (Pair c1 c2) a =
            let c2' = step2 c2 a
                c1' = step1 c1 (done2 c2')
            in  Pair c1' c2'

For each iteration, it seems to obtain the “done” result of the first fold, and feed it to the step function of the second one.

I once wrote a package called foldl-transduce for performing splitting and grouping operations on folds. But, as already mentioned in the thread, it feels more “natural” to perform those kinds of operations on producers, not on sinks.