Solving cyclic boolean implications with pure code and laziness

You might also be able to build this on top of something like Ed’s promises library.

I worked a bit on the subproblem of a thread-safe implementation of the Thunk data type.

The goal is to have this abstract API:

type Thunk
type KickedThunk
thunk :: IO [KickedThunk] -> IO Thunk
doneThunk :: IO Thunk
kick :: Thunk -> IO KickedThunk
force :: Thunk -> IO ()

with the properties that

  • thunk is lazy in its argument, and does not run it directly
  • the first kick triggers execution of the action passed to thunk
  • the action is run at most once
  • force triggers execution of the action passed to thunk, and does not return until
    • the action is executed
    • the action of any thunk kicked by that action is executed
  • Cycles are allowed: The action passed to thunk may kick that thunk (see the usage sketch after this list)
  • It is thread safe: Even if multiple threads force or kick related thunks, all actions are still run at most once, and all calls to force terminate (no deadlock)
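
For concreteness, here is a hypothetical usage sketch (only thunk, kick and force come from the API above; the function name and the RecursiveDo style are made up for illustration): two thunks whose actions kick each other, forming a cycle that force must still resolve without deadlocking.

{-# LANGUAGE RecursiveDo #-}

cycleExample :: IO ()
cycleExample = mdo
    -- t1's action kicks t2, and t2's action kicks t1: a two-element cycle
    t1 <- thunk $ do putStrLn "action of t1"; kt <- kick t2; pure [kt]
    t2 <- thunk $ do putStrLn "action of t2"; kt <- kick t1; pure [kt]
    -- forcing either thunk runs both actions exactly once, then returns
    force t1
    force t2   -- no further output; both actions already ran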

Especially the last point is not so trivial, if two threads start processing the same strongly connected component from different ends. One solution would be to use the ThreadId to choose which thread gets to explore the full strongly connected component, while the other then just waits.

So this code might work:

{-# LANGUAGE LambdaCase #-}

import Control.Concurrent (ThreadId, myThreadId)
import Control.Concurrent.MVar

type Thunk = MVar (Either (IO [KickedThunk]) KickedThunk)
data ResolvingState = NotStarted | ProcessedBy ThreadId (MVar ()) | Done
data KickedThunk = KickedThunk (MVar [KickedThunk]) (MVar ResolvingState)

thunk :: IO [KickedThunk] -> IO Thunk
thunk act = newMVar (Left act)

doneThunk :: IO Thunk
doneThunk = do
    mv_ts <- newMVar []
    mv_s <- newMVar Done
    newMVar (Right (KickedThunk mv_ts mv_s))

-- Recursively explores the thunk, and kicks the execution
-- May return before execution is done (if started by another thread)
kick :: Thunk -> IO KickedThunk
kick t = takeMVar t >>= \case
    Left act -> do
        mv_thunks <- newEmptyMVar
        mv_state <- newMVar NotStarted
        let kt = KickedThunk mv_thunks mv_state
        putMVar t (Right kt)

        kts <- act
        putMVar mv_thunks kts
        pure kt

    -- Thunk was already kicked, nothing to do
    Right kt -> do
        putMVar t (Right kt)
        pure kt

wait :: KickedThunk -> IO ()
wait (KickedThunk mv_deps mv_s) = do
    my_id <- myThreadId
    s <- takeMVar mv_s
    case s of
        -- Thunk and all dependencies are done
        Done -> putMVar mv_s s
        -- Thunk is being processed by a higher priority thread, so simply wait
        ProcessedBy other_id done_mv | other_id < my_id -> do
            putMVar mv_s s
            readMVar done_mv
        -- Thunk is already being processed by this thread, ignore
        ProcessedBy other_id done_mv | other_id == my_id -> do
            putMVar mv_s s
            pure ()
        -- Thunk is not yet processed, or processed by a lower priority thread, so process now
        _ -> do
            done_mv <- newEmptyMVar
            putMVar mv_s (ProcessedBy my_id done_mv)
            ts <- readMVar mv_deps
            mapM_ wait ts
            -- Mark kicked thunk as done
            _ <- swapMVar mv_s Done
            -- Wake up waiting threads
            putMVar done_mv ()

force :: Thunk -> IO ()
force t = do
    rt <- kick t
    wait rt

OCaml, which isn’t a pure language, so you can pull some neat tricks using mutable references, without making your whole code monadic

As a bonus, you’re going to inflict so much pain on any Haskeller who tries to port your code 🎉

Well, not if I can wrap it up in a safe pure API as explored here - now the OCaml programmer has the pain of using the API correctly, but the Haskell programmer has it easy 🙂


I watched Kmett’s talk on propagators, and maybe that gives me a bit more terminology to describe what I think I’m doing here, as there are two parts:

  • LBool provides a monadic API to set up a propagator network, albeit a very simple one: All lattices involved are the two-point lattice consisting of “could be True” < “definitely False”, and the only propagator is implies. It has a decent implementation to solve the propagation (I think it may even be optimal in some sense, but that’s easy with a two-point lattice). It requires the programmer to take care to register all relevant edges in the graph using implies before reading off values using canBeTrue.

    I think this can be built on top of Kmett’s Cell, with a newtype around Bool to have a special Propagated instance with merge False _ = Change False False (a rough sketch follows after this list).

  • PBool builds on top of that, but wraps it in a pure and safe API (consisting of pFalse, all and unPBool). This way, the programmer can define the propagator network using simple functional programming, without dealing with monads.
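
To make the first point concrete, here is a rough, self-contained sketch of that two-point lattice. I restate the Propagated class and Change type locally (modelled on Kmett’s propagators code) so the snippet stands alone; the newtype name CanBeTrue is made up, and the real library’s definitions may well differ in detail:

-- Did the merge learn anything new, and what is the merged value?
data Change a = Change Bool a

class Propagated a where
    merge :: a -> a -> Change a

-- Two-point lattice: CanBeTrue True ("could be True") < CanBeTrue False ("definitely False")
newtype CanBeTrue = CanBeTrue Bool

instance Propagated CanBeTrue where
    -- already definitely False: nothing can change any more
    merge (CanBeTrue False) _                 = Change False (CanBeTrue False)
    -- learning "definitely False" about a "could be True" cell is real progress
    merge (CanBeTrue True)  (CanBeTrue False) = Change True  (CanBeTrue False)
    -- merging "could be True" into "could be True": no new information
    merge old               (CanBeTrue True)  = Change False old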

I am more excited about the latter, and I wonder how it compares to other approaches. For example, the propagators library in Prop.hs seems to achieve something similar, using reifyGraph so that one can also use pure programs to define the program, but it seems it all has to happen within forward and backwards, so you can’t easily read off the cells as you go. So I hope I have hit a separate point in the design space… But I need to carve it out more precisely.

I should try to apply my construction to Kmett’s Cell data structure from the propagators library, as an alternative to his Prop.hs, for a better comparison.


Indeed, I could take my approach and apply it to Edward’s propagator code:


Started to put it into a stand-alone library:

https://github.com/nomeata/haskell-rec-val/.

  • Data.Recursive.Propagator contains a very naive propagator library in IO
  • Data.Recursive wraps it up in pure functions, and documents safety requirements
  • Data.Recursive.Bool and Data.Recursive.Set provide safe APIs for some example types
  • example.hs shows what you can do with them
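
For instance, a definition in example.hs might look roughly like this (sketched here using the rInsert/getR names that come up later in this thread; the actual file and module layout may differ): two set values that refer to each other in a plain let.

import qualified Data.Set as S
import Data.Recursive.Set  -- assumed home of rInsert/getR; exact exports may differ

mutualSets :: S.Set Int
mutualSets =
    let s1 = rInsert 23 s2   -- s1 is defined in terms of s2 ...
        s2 = rInsert 42 s1   -- ... and s2 in terms of s1
    in getR s1               -- yields S.fromList [23,42]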

Sorry if this sounds somewhat negative in tone, but if you’re intending to publish this package to Hackage could you please reconsider the choice of module name “Data.Recursive”? That is a very general name and a very valuable part of the namespace. It might be better to take that name only after a package has become firmly established as widely used in the ecosystem.

(If the package is not to be published on Hackage then by all means use whatever names you like, of course!)


I keep renaming the whole thing, and I am happy to take suggestions. What would you call a type that allows recursive definitions?

I certainly want to publish it to Hackage; I think the RSet API is useful in a fair number of applications, and that this is not just a toy experiment. Renaming the module after it is established isn’t really a good idea either, though, so let’s find a good name right away 🙂

What would you think if all my modules are below Data.Recursive, but do not actually use Data.Recursive itself?


Perhaps Data.RecVal or Data.RVal?

If this approach is as generic as it seems, there’s a possibility of it (eventually!) ending up as a basic feature of Haskell, like Maybe or the list - would it be worthwhile to keep it as a candidate package for as long as possible in Hackage, as a precaution?


https://cseweb.ucsd.edu/~ricko/CSE30/Lvalues%20and%20Rvalues.htm

…maybe not RVal: people trying to search for it would encounter many (mostly) unrelated matches.

Trying to find a suitable name for, well, anything can be an “interesting” process - some time ago, I chose a name for a type class; as I dimly recall, it was the better part of two years before the search engines stopped suggesting alternative words with similar spelling. The inability to remove a misnamed package from Hackage only adds to the challenge: “once wrong, always wrong”.



…before I forget again, the following paper could be describing something similar:

Based on the name, I thought it had a connection to this paper (same author), but it describes a different concept which seems to resemble your approach.

(As for “reinventing wheels”: these papers have only recently been made publicly available by the publishers - in the absence of easily-obtainable prior art, how are you supposed to know if something’s already been invented? ;-)


Ah right, R-values already exist.

One more name idea: Data.Solidarity or Data.Solidary, because solidaire means interdependent or mutually dependent in French but even in English the word is associated with mutual connections between people.

Right now I am going with rec-def for the package name, and using various modules under the Data.Recursive.* namespace, without using Data.Recursive itself:

See https://haskell-rec-def.nomeata.de/ for the haddocks (repo now at https://github.com/nomeata/haskell-rec-def).

Hopefully that’s non-squatty enough.


Interesting! I have a hunch that you might be able to provide a similar top-level interface, perhaps less efficiently, with neither IO nor unsafePerformIO nor thunks, etc.? In particular, the most naive thing is to let the R monad record tuples of names and trees of possible assignments with results, and to have getR explicitly solve for a minimal set of satisfying assignments. So it’s really a sort of “labeled nondeterminism” monad in which you’re computing the least fixed point?

The immediate issue then presented is that any binding like let x = ... would need to explicitly also give an “x” as a name. So this sort of shows that what’s going on is that IO (or unsafePerformIO, really) is being used to solve unique label identity. In turn this reminds me of “observable sharing”, à la the observable-sharing package (“Simple observable sharing”) and the literature on Observable Sharing.

I seem to recall some more recent work on the topic, but not exactly where…

Note that R is not a monad, and that’s the point! It’s just a wrapper around the values, and you can pass them around like normal ones, and use them even recursively. And I strongly doubt you can implement a similar interface without unsafe code underneath.

The observable sharing work is highly related, but uses (if I remember correctly) even less savory tricks underneath, and is less composable, which is why I like the present implementation.

The library is now maybe ready for a first public release. There are tests, haddock reports 100% documentation coverage, and the overall structure is maybe fine. See https://haskell-rec-def.nomeata.de/ for the full haddocks and in particular https://haskell-rec-def.nomeata.de/Data-Recursive-Examples.html for an introduction.

Would you use this? If not, what would stop you from using it?


I have now published rec-def on Hackage, and wrote a blog post about how to use it:

https://www.joachim-breitner.de/blog/792-More_recursive_definitions

More on the implementation follows later.


I’ve just noticed that you can do many of the things you advertise also with the lattices package, e.g.:

import Algebra.Lattice
import qualified Data.Set as S
import qualified Data.Map as M

transitive g = 
  lfpFrom g $ \sets -> 
    M.mapWithKey 
      (\v vs -> S.insert v (S.unions [ sets M.! v' | v' <- S.toList vs ]))
      sets

main = print $ transitive 
  $ M.map S.fromList $ M.fromList [(1,[2,3]),(2,[1,3]),(3,[])]

I guess your package has the advantage of allowing more natural code. But it requires using custom functions like rInsert and the R type wrapper, so I’m not sure which is easier to use.

I guess performance is another advantage of your approach.


Thanks for the pointer, I was not aware of that package!

Would it be fair to say that rec-def is to lattices’ lfpFrom as expressing recursion with let is to using fix?

But it requires using custom functions like rInsert

True! That is so because I want Data.Recursive.Set to provide a safe interface, and only expose monotonic functions. An alternative interface allowing (lifting of) arbitrary functions is possible as well, but would be a bit less safe.
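
For comparison, here is a rough sketch (not taken from the package; rUnions is a hypothetical helper, and the exact module and function names may differ) of what the transitive-closure example above could look like on top of Data.Recursive.Set:

import qualified Data.Map as M
import qualified Data.Set as S
import Data.Recursive.Set  -- assumed home of rInsert/getR

transitive :: Ord a => M.Map a (S.Set a) -> M.Map a (S.Set a)
transitive g = M.map getR sets
  where
    -- sets is defined recursively in terms of itself; the cyclic
    -- dependencies are resolved when getR is evaluated
    sets = M.mapWithKey
        (\v vs -> rInsert v (rUnions [ sets M.! v' | v' <- S.toList vs ]))
        g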

More on the implementation of the part that turns an imperative propagator network API into a pure API published:

https://www.joachim-breitner.de/blog/793-rec-def__Behind_the_scenes
