Monoidal dupable channels

nomeata · September 23, 2022, 2:45pm

A useful data structure is the channel, with an API that could look like this (I use different types for the input and the output handle, so this does not quite look like Control.Concurrent.Chan)

data CIn a
data COut a
newChan   :: IO (CIn a, COut a) 
writeChan :: CIn a -> a -> IO () 
readChan  :: COut a -> IO (Maybe a)

The idea is that the channel forms a queue, writeChan adds entries to the back, and readChan pops the first entry (returning Nothing if there is none).

So far so good.

Here is a variant: Instead of storing a queue of elements to be popped one by one, it may be preferrable if the channel just stores some aggregate value: If multiple elements are written to the channel before they are consumed, they can be combined using (<>):

newChan   :: IO (CIn a, COut a) 
writeChan :: Semigroup a => CIn a -> a -> IO () 
readChan  :: COut a -> IO (Maybe a)

I put the constraint on writeChan, and not readChan, expecting the library to use (<>) eagerly (and likely strictly) as it updates the value waiting for the reader.

Still so far so good; an implementation with a single MVar would probably suffice.

But what if we want to allow multiple readers! For example, by adding a

dupChan   :: COut a -> IO (COut a)

with the idea that the two output handles both receive all updates. How would a reasonable implementation of that look like?

One naive possibility is to keep a list of MVar a’s around, one for each output channel generated, and writeChan c x updates all of them. But that is wasteful, as it will do (<> x) to each of them, likely duplicating work.

So I am wondering if the underlying data structure somehow contains the sequence of values added to the queue, with each COut’s pointing to a different positions in that sequence, and readChan moves that pointer to the end, and combines all the visited values with (<>) in a way that all the other readers still behind get to use that result. Maybe with the guarantee that every value written to the channel gets passed to (<>) at most once, no matter how many readers and when the readers read.

But there are various trade offs and it makes my head spin. So – did someone build a data structure like this already?

treblacy · September 23, 2022, 4:58pm

If I read an item off of a queue, I expect that the queue also loses that item as a result.

By extension, if I read off of this monoid channel, I expect that the channel resets itself to zero (mempty). This implies that readChan needs the Monoid constraint too; and BTW newChan also needs it for initialization.

I haven’t thought of dupability.

nomeata · September 23, 2022, 7:59pm

Ah, despite the title I am actually using Semigroup in the proposed interface, thinking that it may be worth distinguishing “no new value” from mzero. But either works fine, I think.