I’m interested in parallel stream processing. Not the map/reduce kind of
parallelism, but rather pipeline parallelism like in manufacturing and CPU
instruction pipelines.
I’ve recently made some progress using an Arrow interface and leveraging
associated types to avoid copying data between the queues that connect the
different stages of the pipeline:
Parallel stream processing with zero-copy fan-out and sharding
This is heavily inspired by Martin Thompson et al.’s work on the LMAX Disruptor
and Aeron. As I was writing things up I also stumbled upon an interview with the
late Jim Gray where he advocated for pipeline parallelism.
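To give a rough idea of what I mean by pipeline parallelism, here is a minimal
sketch, not the code from the write-up. Each stage runs on its own thread and
stages are connected by queues, so stage two can work on item n while stage one
is already working on item n+1. The names `Pipeline`, `Stage`, `Compose`,
`deploy` and `runPipeline` are made up for this illustration, the queues are
plain one-slot `MVar`s from base rather than Disruptor-style ring buffers, and
the associated-types trick for zero-copy fan-out is left out entirely.

```haskell
{-# LANGUAGE GADTs #-}

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)

-- | A pipeline from inputs of type a to outputs of type b
-- (hypothetical type, for illustration only).
data Pipeline a b where
  Stage   :: (a -> b) -> Pipeline a b
  Compose :: Pipeline a b -> Pipeline b c -> Pipeline a c

-- | Arrow-style left-to-right composition of pipelines.
(>>>) :: Pipeline a b -> Pipeline b c -> Pipeline a c
(>>>) = Compose
infixr 1 >>>

-- | A one-slot queue between stages; Nothing marks the end of the stream.
-- Assumption: a real implementation would use a bounded ring buffer instead.
type Queue a = MVar (Maybe a)

-- | Spawn one thread per stage and return the queue of the last stage.
deploy :: Pipeline a b -> Queue a -> IO (Queue b)
deploy (Stage f) input = do
  output <- newEmptyMVar
  let loop = do
        item <- takeMVar input
        case item of
          Nothing -> putMVar output Nothing  -- forward end-of-stream and stop
          Just x  -> do
            let y = f x
            -- Force to weak head normal form so the work happens on this
            -- stage's thread rather than in a downstream stage.
            y `seq` putMVar output (Just y)
            loop
  _ <- forkIO loop
  return output
deploy (Compose p q) input = deploy p input >>= deploy q

-- | Feed a list through the pipeline and collect the outputs in order.
runPipeline :: Pipeline a b -> [a] -> IO [b]
runPipeline p xs = do
  input  <- newEmptyMVar
  output <- deploy p input
  -- The producer gets its own thread so it cannot block the collector.
  _ <- forkIO (mapM_ (putMVar input . Just) xs >> putMVar input Nothing)
  let collect acc = do
        item <- takeMVar output
        case item of
          Nothing -> return (reverse acc)
          Just y  -> collect (y : acc)
  collect []

main :: IO ()
main = do
  ys <- runPipeline (Stage (+ 1) >>> Stage (* 2) >>> Stage show) [1, 2, 3]
  print ys  -- prints ["4","6","8"]
```

The stages only run in parallel on multiple cores if this is compiled with
`-threaded` and run with `+RTS -N`; otherwise they are merely interleaved.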
As far as I know, no streaming library in Haskell uses this approach, but I’d be
happy to be proven wrong. I would also be happy to answer any questions or hear
any kind of feedback. Cheers!