You might also be able to build this on top of something like Ed’s
I worked a bit on the subproblem of a thread-safe implementation of the
`Thunk` data type.
The goal is to have this abstract API:
```haskell
type Thunk
type KickedThunk

thunk     :: IO [KickedThunk] -> IO Thunk
doneThunk :: IO Thunk
kick      :: Thunk -> IO KickedThunk
force     :: Thunk -> IO ()
```
with the properties that

- `thunk` is lazy in its argument, and does not run it directly
- the first `kick` triggers execution of the action passed to `thunk`
- the action is run at most once
- `force` triggers execution of the action passed to `thunk`, and does not return until
  - the action is executed
  - the action of any thunk kicked by the action is executed
- Cycles are allowed: the action passed to `thunk` may kick that very thunk
- It is thread safe: even if multiple threads `kick` related thunks, all actions are still run at most once, and all calls to `force` terminate (no deadlock)
Especially the last point is not so trivial, if two threads start processing the same strongly connected component from different ends. One solution would be to use the
`ThreadId` to choose which thread gets to explore the full strongly connected component, while the other then just waits.
So this code might work:
```haskell
type Thunk = MVar (Either (IO [KickedThunk]) KickedThunk)

data ResolvingState = NotStarted | ProcessedBy ThreadId (MVar ()) | Done

data KickedThunk = KickedThunk (MVar [KickedThunk]) (MVar ResolvingState)

thunk :: IO [KickedThunk] -> IO Thunk
thunk act = newMVar (Left act)

doneThunk :: IO Thunk
doneThunk = do
    mv_ts <- newMVar []
    mv_s <- newMVar Done
    newMVar (Right (KickedThunk mv_ts mv_s))

-- Recursively explores the thunk, and kicks the execution
-- May return before execution is done (if started by another thread)
kick :: Thunk -> IO KickedThunk
kick t = takeMVar t >>= \case
    Left act -> do
        mv_thunks <- newEmptyMVar
        mv_state <- newMVar NotStarted
        let kt = KickedThunk mv_thunks mv_state
        putMVar t (Right kt)
        kts <- act
        putMVar mv_thunks kts
        pure kt
    -- Thunk was already kicked, nothing to do
    Right kt -> do
        putMVar t (Right kt)
        pure kt

wait :: KickedThunk -> IO ()
wait (KickedThunk mv_deps mv_s) = do
    my_id <- myThreadId
    s <- takeMVar mv_s
    case s of
        -- Thunk and all dependencies are done
        Done -> putMVar mv_s s
        -- Thunk is being processed by a higher priority thread, so simply wait
        ProcessedBy other_id done_mv | other_id < my_id -> do
            putMVar mv_s s
            readMVar done_mv
        -- Thunk is already being processed by this thread, ignore
        ProcessedBy other_id done_mv | other_id == my_id -> do
            putMVar mv_s s
            pure ()
        -- Thunk is not yet processed, or processed by a lower priority thread, so process now
        _ -> do
            done_mv <- newEmptyMVar
            putMVar mv_s (ProcessedBy my_id done_mv)
            ts <- readMVar mv_deps
            mapM_ wait ts
            -- Mark kicked thunk as done
            _ <- swapMVar mv_s Done
            -- Wake up waiting threads
            putMVar done_mv ()

force :: Thunk -> IO ()
force t = do
    rt <- kick t
    wait rt
```
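As a sanity check of the cycle and at-most-once properties, here is a self-contained sketch: it repeats the definitions above (minus `doneThunk`) and adds a hypothetical `main` of my own, in which two thunks kick each other.

```haskell
{-# LANGUAGE LambdaCase #-}
import Control.Concurrent
import Control.Monad (when)
import Control.Monad.Fix (mfix)
import Data.IORef

type Thunk = MVar (Either (IO [KickedThunk]) KickedThunk)
data ResolvingState = NotStarted | ProcessedBy ThreadId (MVar ()) | Done
data KickedThunk = KickedThunk (MVar [KickedThunk]) (MVar ResolvingState)

thunk :: IO [KickedThunk] -> IO Thunk
thunk = newMVar . Left

kick :: Thunk -> IO KickedThunk
kick t = takeMVar t >>= \case
    Left act -> do
        mv_thunks <- newEmptyMVar
        mv_state <- newMVar NotStarted
        let kt = KickedThunk mv_thunks mv_state
        putMVar t (Right kt)   -- unlock before running, so cyclic kicks see Right
        kts <- act
        putMVar mv_thunks kts
        pure kt
    Right kt -> putMVar t (Right kt) >> pure kt

wait :: KickedThunk -> IO ()
wait (KickedThunk mv_deps mv_s) = do
    my_id <- myThreadId
    s <- takeMVar mv_s
    case s of
        Done -> putMVar mv_s s
        ProcessedBy other_id done_mv | other_id < my_id ->
            putMVar mv_s s >> readMVar done_mv
        ProcessedBy other_id _ | other_id == my_id ->
            putMVar mv_s s
        _ -> do
            done_mv <- newEmptyMVar
            putMVar mv_s (ProcessedBy my_id done_mv)
            readMVar mv_deps >>= mapM_ wait
            _ <- swapMVar mv_s Done
            putMVar done_mv ()

force :: Thunk -> IO ()
force t = kick t >>= wait

main :: IO ()
main = do
    counter <- newIORef (0 :: Int)
    let bump = modifyIORef counter (+ 1)
    -- tie the knot: t1's action kicks t2, and t2's action kicks t1
    (t1, t2) <- mfix $ \ ~(_, t2') -> do
        a <- thunk (bump >> (: []) <$> kick t2')
        b <- thunk (bump >> (: []) <$> kick a)
        pure (a, b)
    force t1
    force t2
    force t1                       -- forcing again must be a no-op
    n <- readIORef counter
    when (n /= 2) (error "an action ran more than once (or not at all)")
    print n                        -- prints 2: each action ran exactly once
```

If the bookkeeping is right, `force` terminates despite the cycle, and the counter ends at 2.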
OCaml, which isn’t a pure language, so you can pull some neat tricks using mutable references, without making your whole code monadic.
As a bonus, you’re gonna inflict so much pain on any Haskeller who tries to port your code.
Well, not if I can wrap it up in a safe pure API as explored here - now the OCaml programmer has the pain of using the API correctly, but the Haskell programmer has it easy.
I watched Kmett’s talk on propagators, and maybe that gives me a bit more terminology to describe what I think I’m doing here, as there are two parts:
`LBool` provides a monadic API to set up a propagator network, albeit a very simple one: all lattices involved are the two-point lattice consisting of “could be True” < “definitely False”, and the only propagator is `implies`. It has a decent implementation to solve the propagation (I think it may even be optimal in some sense, but that’s easy with a two-point lattice). It requires the programmer to pay attention to register all relevant edges in the graph using `implies` before reading off values.
I think this can be built on top of Kmett’s `Cell`, with a newtype around `Bool` to have a special `merge False _ = Change False False`.
`PBool` builds on top of that, but wraps it in a pure and safe API (consisting of `unPBool`). This way, the programmer can define the propagator network using simple functional programming, without dealing with monads.
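To make the two-point-lattice idea above concrete, here is a guess at what such a network looks like in plain IO. The representation and the names (`newLBool`, `setFalse`, `canBeTrue`) are mine, not the gist's; `implies` is read contrapositively, so “definitely False” propagates from conclusion to premise:

```haskell
import Control.Monad (replicateM, unless, when)
import Data.IORef

-- One cell of the two-point lattice “could be True” < “definitely False”
data LBool = LBool { lVal :: IORef Bool, lParents :: IORef [LBool] }

newLBool :: IO LBool
newLBool = LBool <$> newIORef True <*> newIORef []

-- Register the edge “x implies y”: read contrapositively, whenever y
-- becomes definitely False, x must become definitely False as well
implies :: LBool -> LBool -> IO ()
implies x y = do
    modifyIORef (lParents y) (x :)
    v <- readIORef (lVal y)
    unless v (setFalse x)  -- keep the network consistent for late edges

-- Move a cell down to “definitely False” and propagate along all edges;
-- the check of the old value makes this terminate even on cycles
setFalse :: LBool -> IO ()
setFalse c = do
    v <- readIORef (lVal c)
    when v $ do
        writeIORef (lVal c) False
        readIORef (lParents c) >>= mapM_ setFalse

canBeTrue :: LBool -> IO Bool
canBeTrue = readIORef . lVal

main :: IO ()
main = do
    [a, b, c] <- replicateM 3 newLBool
    a `implies` b
    b `implies` c
    setFalse c                              -- forces b, and then a, to False
    mapM_ (\x -> canBeTrue x >>= print) [a, b, c]  -- prints False three times
```

This sketch is monadic through and through, which is exactly what the pure `PBool` layer is meant to hide.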
I am more excited about the latter, and I wonder how it compares to other approaches. For example, the `propagators` library in `Prop.hs` seems to achieve something similar, using `reifyGraph` so that one can also use pure programs to define the network, but it seems it all has to happen within `backwards`, so you can’t easily read off the cells as you go. So I hope I have hit a separate point in the design space… But I need to carve it out more precisely.
I should try to apply my construction to Kmett’s `Cell` data structure from the propagators library, as an alternative to his `Prop.hs`, for a better comparison.
Indeed, I could take my approach and apply it to Edward’s `Cell`.
Started to put it into a stand-alone library:
- `Data.Recursive.Propagator` contains a very naive propagator library in IO
- `Data.Recursive` wraps it up in pure functions, and documents the safety requirements
- `Data.Recursive.Set` provides safe APIs for some example types
- `example.hs` shows what you can do with them
Sorry if this sounds somewhat negative in tone, but if you’re intending to publish this package to Hackage, could you please reconsider the choice of module name `Data.Recursive`? That is a very general name and a very valuable part of the namespace. It might be better to take that name only after a package has become firmly established as widely used in the ecosystem.
(If the package is not to be published on Hackage then by all means use whatever names you like, of course!)
I keep renaming the whole thing, and I am happy to take suggestions. What would you call a type that allows recursive definitions?
I certainly want to publish it to Hackage; I think the `RSet` API is useful in a fair number of applications, and that this is not just a toy experiment. Renaming the module after it is established isn’t really a good idea either, though, so let’s better find a good name right away.
What would you think if all my modules are below `Data.Recursive`, but do not actually use `Data.Recursive` itself?
If this approach is as generic as it seems, there’s a possibility of it (eventually!) ending up as a basic feature of Haskell, like
Maybe or the list - would it be worthwhile to keep it as a candidate package for as long as possible in Hackage, as a precaution?
`RVal`: people trying to search for it would encounter many (mostly) unrelated matches.
Trying to find a suitable name for, well, anything can be an “interesting” process - some time ago, I chose a name for a type class; as I dimly recall, it was the better part of two years before the search engines stopped suggesting alternative words with similar spelling. The inability to remove a misnamed package from Hackage only adds to the challenge: “once wrong, always wrong”…
…before I forget again, the following paper could be describing something similar:
- Encapsulating non-determinacy in an abstract data type with determinate semantics, F. Warren Burton, JFP 1991.
Based on the name, I thought it had a connection to this paper (same author), but it describes a different concept which seems to resemble your approach.
(As for “reinventing wheels”: these papers have only recently been made publicly available by the publishers - in the absence of easily-obtainable prior art, how are you supposed to know if something’s already been invented?)
Ah right, R-values already exist.
One more name idea: `Data.Solidary`, because “solidaire” means interdependent or mutually dependent in French, but even in English the word is associated with mutual connections between people.
Right now I am going with `rec-def` for the package name, and using various modules under the `Data.Recursive.*` namespace, without using `Data.Recursive` itself. Hopefully that’s non-squatty enough.
Interesting! I have a hunch that you might be able to provide a similar top-level interface, perhaps less efficiently, with neither IO nor unsafePerformIO nor thunks, etc.? In particular, the most naive thing is to let the R monad record tuples of names and trees of possible assignments with results, and to have `getR` explicitly solve for a minimal set of satisfying assignments. So it’s really a sort of “labeled nondeterminism” monad in which you’re computing the least fixed point?
The immediate issue then presented is that any binding like `let x = ...` would need to explicitly also give an “x” as a name. So this sort of shows that what’s going on is that IO (or really unsafePerformIO) is being used to solve unique label identity. In turn this reminds me of “observable sharing”, à la observable-sharing: Simple observable sharing and Observable Sharing
I seem to recall some more recent work on the topic, but not exactly where…
`R` is not a monad, and that’s the point! It’s just a wrapper around the values; you can pass them around like normal ones, and use them even recursively. And I strongly doubt you can implement a similar interface without unsafe code underneath.
The observable sharing work is highly related, but uses (if I remember correctly) even less savory tricks underneath, and is less composable, which is why I like the present implementation.
The library is now maybe ready for a first public release. There are tests, haddock reports 100% documentation coverage, and the overall structure is maybe fine. See https://haskell-rec-def.nomeata.de/ for the full haddocks, and in particular https://haskell-rec-def.nomeata.de/Data-Recursive-Examples.html for an introduction.
Would you use this? If not, what would stop you from using it?
I have now published `rec-def` on Hackage, and wrote a blog post about how to use it:
More on the implementation follows later.
I’ve just noticed that you can do many of the things you advertise also with the `lattices` package, e.g.:
```haskell
import Algebra.Lattice
import qualified Data.Set as S
import qualified Data.Map as M

transitive g = lfpFrom g $ \sets ->
    M.mapWithKey (\v vs -> S.insert v (S.unions [ sets M.! v' | v' <- S.toList vs ])) sets

main = print $ transitive $ M.map S.fromList $ M.fromList [(1,[2,3]),(2,[1,3]),(3,[])]
```
I guess your package has the advantage of allowing more natural code. But it requires using custom functions like `rInsert` and the `R` type wrapper, so I’m not sure which is easier to use.
I guess performance is another advantage of your approach.
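For readers without the lattices package: for present purposes `lfpFrom` is essentially Kleene iteration to a fixed point, so the example can be reproduced with containers alone. The `lfpFrom'` below is my stand-in, specialised to maps of sets and joining with the previous iterate; it is not the library's actual implementation:

```haskell
import qualified Data.Map as M
import qualified Data.Set as S

-- Stand-in for lattices' lfpFrom, specialised to maps of sets:
-- iterate x, x ∨ f x, …, where the join is pointwise set union
lfpFrom' :: (Ord k, Ord a)
         => M.Map k (S.Set a)
         -> (M.Map k (S.Set a) -> M.Map k (S.Set a))
         -> M.Map k (S.Set a)
lfpFrom' x f
    | x' == x   = x
    | otherwise = lfpFrom' x' f
  where
    x' = M.unionWith S.union x (f x)

-- Reflexive-transitive closure, as in the lattices example above
transitive :: M.Map Int (S.Set Int) -> M.Map Int (S.Set Int)
transitive g = lfpFrom' g $ \sets ->
    M.mapWithKey (\v vs -> S.insert v (S.unions [ sets M.! v' | v' <- S.toList vs ])) sets

main :: IO ()
main = print $ transitive $ M.map S.fromList $ M.fromList [(1,[2,3]),(2,[1,3]),(3,[])]
-- prints fromList [(1,fromList [1,2,3]),(2,fromList [1,2,3]),(3,fromList [3])]
```

Joining with the previous iterate keeps the iteration inflationary, so no reachability facts are lost along the way.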
Thanks for the pointer, I was not aware of that package!
“But it requires using custom functions like `rInsert`”
True! That is so because I want `Data.Recursive.Set` to provide a safe interface, and only expose monotonic functions. An alternative interface allowing (lifting of) arbitrary functions is possible as well, but would be a bit less safe.
More on the implementation of the part that turns an imperative propagator network API into a pure API is now published: