Testing concurrent code using DejaFu

Concurrency and shared mutable state are hard to get right, and even harder to test. In Haskell, STM and DejaFu can help. I wrote a blog post introducing some patterns I recently adopted: Testing concurrent code using DejaFu » nicolast.be

It’d be nice if cell a could become a plain a removing the need to unpack the Identity constructor. I have a hunch this might be possible by using some closed type family, but didn’t pursue this further yet.

Indeed, this is a classic: Higher-Kinded Data :: Reasonably Polymorphic
They even have a validation example that uses generics to pull the wrapper out of the fields: f Maybe -> Maybe (f Identity).
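
For readers who haven't seen the pattern, here is a minimal sketch of the closed type family trick, written by hand rather than with generics. The `Person` record and `validate` function are hypothetical illustrations, not code from the linked post.

```haskell
{-# LANGUAGE TypeFamilies #-}

import Data.Functor.Identity (Identity)
import Data.Kind (Type)

-- Closed type family: Identity is erased, any other wrapper is applied.
type family HKD (f :: Type -> Type) (a :: Type) :: Type where
  HKD Identity a = a
  HKD f a = f a

-- Hypothetical record, parameterised over the field wrapper.
data Person f = Person
  { personName :: HKD f String
  , personAge :: HKD f Int
  }

-- Hand-written validation: every field must be present.
-- With f ~ Maybe the fields are Maybe values; with f ~ Identity
-- they are plain values, so no Identity constructor to unpack.
validate :: Person Maybe -> Maybe (Person Identity)
validate p = Person <$> personName p <*> personAge p
```

The fields of the validated record can then be used directly, e.g. `personAge <$> validate p` has type `Maybe Int`.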

Oh, thanks! I had never seen this post before, but indeed, similar mechanism, and looks like my hunch about using a closed type family may pay off :wink: I’ll experiment a bit more…

Not as simple, it seems. Stuck on

    • Couldn't match expected type: TVar stm Int
                  with actual type: HKD (TVar stm) Int

with (as in the linked post)

type family HKD f a = r where
  HKD Identity a = a
  HKD f a = f a

Should be working. I’ve used multi-parameter wrappers with it.

data AttrsF f = Attrs
  { params     :: HKD f TextureParams
  , transforms :: HKD f Transform
  }

type Buffers = AttrsF (Allocated 'Coherent)

What’s your context?

TVar is itself a type family – probably that is what makes inference complicated?

I’ve experimented with this and found something that compiles. It uses a type family with an extra parameter isTVar, which has the value () for TVar types and Void for all other types. Here is the relevant part:

import Data.Void

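-- isTVar selects the representation: () for TVar-backed fields, Void otherwise.
type family HKD isTVar f a = r where
  HKD () stm a = TVar stm a  -- live store: a TVar in the given STM monad
  HKD Void Identity a = a    -- frozen store: the plain value, no Identity wrapper
  HKD Void f a = f a         -- any other wrapper is applied as-is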

data Store isTVar cell = Store
  { storeA :: HKD isTVar cell Int,
    storeB :: HKD isTVar cell Int
  }

type FrozenStore = Store Void Identity

snapshotWith ::
  Applicative m =>
  (forall a. TVar cell a -> m a) ->
  Store () cell ->
  m FrozenStore
snapshotWith readCell store =
  Store
     <$> readCell (storeA store)
     <*> readCell (storeB store)

snapshotSTM :: MonadSTM stm => Store () stm -> stm FrozenStore
snapshotSTM = snapshotWith readTVar

snapshot :: MonadConc m => Store () (STM m) -> m FrozenStore
snapshot = atomically . snapshotSTM

Given some notes in the GHC docs about type families applied to type families, I suspected that might indeed be related. I’ll give your approach a try. Any reason you’re using ()/Void and not (kind-promoted) Bool?

No important reason. Only because I didn’t know what was idiomatic, and I was trying not to require any additional language extensions. But certainly kind-level Bool is more readable.
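
For reference, a minimal sketch of what the kind-level Bool variant might look like. To keep it self-contained, `MonadVar`, `Var`, `newVar` and `readVar` are hypothetical stand-ins (backed by IORef in IO) for concurrency's MonadSTM and TVar; they are not part of any library, and I haven't checked this against the real concurrency types.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Functor.Identity (Identity)
import Data.IORef (IORef, newIORef, readIORef)
import Data.Kind (Type)

-- Hypothetical stand-in for MonadSTM: the variable type is an
-- associated type family indexed by the monad, like TVar stm.
class Monad m => MonadVar m where
  type Var m :: Type -> Type
  newVar :: a -> m (Var m a)
  readVar :: Var m a -> m a

instance MonadVar IO where
  type Var IO = IORef
  newVar = newIORef
  readVar = readIORef

-- The promoted Bool says whether the field is a mutable variable.
type family HKD (isVar :: Bool) (f :: Type -> Type) (a :: Type) :: Type where
  HKD 'True m a = Var m a
  HKD 'False Identity a = a
  HKD 'False f a = f a

data Store (isVar :: Bool) (cell :: Type -> Type) = Store
  { storeA :: HKD isVar cell Int
  , storeB :: HKD isVar cell Int
  }

type FrozenStore = Store 'False Identity

-- Read every variable and return a store of plain values.
snapshot :: MonadVar m => Store 'True m -> m FrozenStore
snapshot store = Store <$> readVar (storeA store) <*> readVar (storeB store)
```

Because `Var` is not injective, a live store usually needs a type annotation when it is built, e.g. `Store a b :: Store 'True IO`.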

The semantics of Bools aren’t immediately apparent.

Why not match TVar directly? TVar is TVar, while Identity and anything else isn’t.

What concerns me about the MonadConc approach is that it doesn’t (and cannot) completely mimic the upstream concurrency primitives. It’s basically a fork frozen since 2016.

For example, compare the implementations of the concurrently function in the async and concurrency packages. There are lots of differences.

Also, async's async is cancelled with the AsyncCancelled exception, while concurrency's is simply killed. I bumped into this difference in a real project.

So the DejaFu approach means that the usual intuition does not apply, and concurrency functions can behave differently from what you would expect. This in part defeats the purpose of the library.

And even if somebody magically kept MonadConc in sync with upstream, it would still behave differently, since it doesn’t call the primitive functions in IO/STM directly but goes through some user-supplied monad, which can change the final behavior dramatically. Simple wrappers like UnliftIO still call the base/async functions, whereas MonadConc calls its own copies of those functions, built on a few primitives from the user's monad.

So if you’re testing concurrent code with DejaFu, you’re testing the concurrency package's implementation, not your usual concurrent code. And you always need to refer to the concurrency docs and know how it differs from upstream.

Maybe it’s fine in some cases, but the difference needs to be clearly stated (otherwise it can lead to unexpected bugs), and it adds an onboarding and support burden for developers.

I agree. While the concurrency modules provide API compatibility with some other libraries, they’re not “the same” and hence can behave differently. In a sense, when using concurrency, you’re using concurrency, not (e.g.) async. They just happen to “look the same”.

What I argue is that concurrency should (barely) exist. If the stm library itself were abstracted over MonadSTM (or similar, likely provided by the stm package itself), and async and others were built on top of that, then no duplication of implementations would be needed at all.

This doesn’t only apply to concurrency/dejafu, but also the recently released io-classes/io-sim. Similarly for concurrency primitives, though finding the right primitives might be more complicated there.

Note that io-classes puts most “derived” combinators also into the respective type class (here, in MonadAsync), which means that the IO instance can reuse the implementation from async.
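
A rough sketch of that pattern, using mutable references instead of async so it only needs base. `MonadRef`, `Ref`, `modifyRef'` and `countTo` are hypothetical names for illustration and not the io-classes API: the derived combinator lives in the class with a default, the IO instance overrides it with the upstream function, and another monad falls back to the default.

```haskell
{-# LANGUAGE TypeFamilies #-}

import Control.Monad.ST (ST, runST)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)
import Data.Kind (Type)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)

class Monad m => MonadRef m where
  type Ref m :: Type -> Type
  newRef :: a -> m (Ref m a)
  readRef :: Ref m a -> m a
  writeRef :: Ref m a -> a -> m ()

  -- Derived combinator kept inside the class, with a default
  -- built from the primitives above.
  modifyRef' :: Ref m a -> (a -> a) -> m ()
  modifyRef' ref f = do
    x <- readRef ref
    writeRef ref $! f x

-- The IO instance reuses the upstream implementation directly.
instance MonadRef IO where
  type Ref IO = IORef
  newRef = newIORef
  readRef = readIORef
  writeRef = writeIORef
  modifyRef' = modifyIORef'

-- A second instance that relies on the default modifyRef'.
instance MonadRef (ST s) where
  type Ref (ST s) = STRef s
  newRef = newSTRef
  readRef = readSTRef
  writeRef = writeSTRef

-- Code under test is written once against the class.
countTo :: MonadRef m => Int -> m Int
countTo n = do
  ref <- newRef 0
  mapM_ (\_ -> modifyRef' ref (+ 1)) [1 .. n]
  readRef ref

-- The same code run purely, via ST.
pureCount :: Int
pureCount = runST (countTo 3)
```

In IO, `countTo` ends up calling `modifyIORef'` itself, so the non-test build behaves as if the class were not there, at least semantically.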

Fully agreeing with your larger point though :+1:

io-classes/io-sim specifically say that they do not alter original semantics. IO instances directly call original functions. So the non-testing version of the program behaves the same way as if there was no additional layer at all. They also only support ReaderT transformers as others may alter the program behavior dramatically.

The usual intuition of what to expect from a program calling forkIO/withAsync/… still works for io-classes. They learned this lesson.

Their “provide zero-cost abstractions” claim is questionable. Once you have MonadAsync m => m (), it’s no longer “zero-cost”. You need to fix the monad to IO (or a ReaderT ... IO stack) to get direct calls to the original functions, and once the monad is fixed you can’t replace it with a testing one. The exceptions are conditional compilation or backpack, which makes the whole type class approach questionable: why not use backpack in the first place?

As for moving the stm library to MonadSTM: it makes sense from the point of view of a type-class-based verification library. But:

  • It’s not the only verification method (conditional compilation with CPP, backpack, linking with a custom base/stm-like package or a custom runtime, etc.). It’s better to make the standard library usable in as many contexts as possible than to tailor it to the needs of one library.
  • Monads change behavior (evaluation order, allocation patterns). MonadSTM might not work well with many of them. I don’t think that it’s fair to require stm/async developers to find bugs in other people’s monads. And it’s better to not provide any MonadSTM at all than to provide one but only support an IO instance (still spending time to reject numerous bug reports).
  • Every 3rd-party library can make its own decisions on how to split type classes: MonadSTM and MonadAsync, or a big MonadConcurrency, or a fine-grained MonadForkIO, MonadMVar, …? Is the standard library a good place to set this split in stone?
  • It will be harder for newcomers to understand types (why does my main have type (MonadSTM m, MonadIO m) => m () instead of just IO ()?).

I think that io-sim took a better route than concurrency, but using backpack and linking a different version of stm for tests might be an even better solution (real zero-cost abstraction and no custom type classes).