Parallel property-based testing with a deterministic thread scheduler

https://stevana.github.io/parallel_property-based_testing_with_a_deterministic_thread_scheduler.html

2 Likes

It seems you are not familiar with: GitHub - input-output-hk/io-sim: Haskell's IO simulator which closely follows core packages (base, async, stm). (io-sim: A pure simulator for monadic concurrency with STM.). It currently doesn’t support first class thread schedulers but it is deterministic and one can go a long way with random attenuation (delays, errors) in the simulation if done right. We have a lot of success using it at IOG networking team :slight_smile:

I’m aware of io-sim, but I’m not aware of any document which describes how it works. It seems to me that it reimplements parts of the run-time system, which seems like a lot of work and error prone.

The approach in this post, due to matklad, seems much simpler (can be explained in full in a relatively a short post). I know that io-sim is more comprehensive in some regards, but not all regards, e.g. it doesn’t support IORefs.

I’ve also never seen a simple example of how to combine io-sim with stateful and/or parallel property-based testing. The traces that io-sim produces and the dynamic logic of quickcheck-dynamic somehow checks is again not explained anywhere (that I’ve seen).

I’d be curious to hear from somebody that understands how io-sim works what the eventual pitfalls are with the approach taken in this post, i.e. what things will be difficult to achieve with this simpler approach.

1 Like

I see, yes you are right, we should definitely write something about how it works in more detail. But in a nutshell it is just a free monad like data structure that has a pure interpretation that follows the IO/GHC run-time system semantics. We have tests that verify the pure implementation against the real behavior so, yes it was quite a piece of work but it is quite robust and correct.

The way it schedules threads and deals with exceptions and what not is not much different very similar to what you describe in your post, but it has the advantage that it also has a notion of time and is pure. So in the end you can write your business logic in io-sim, interpret it in IO for your executable and test it with quickcheck pure-ly, everything with the same code.

I am not familiar with quickcheck-dynamic but I see it was developed by Quviq which also had their part in io-sim by implementing the partial order reduction (POR) feature that can find and swap concurrent threads and find even more possible race conditions. I also see in the README of quickcheck-dynamic reference to io-sim so there might be something there explaining.

I haven’t thought about time in this context, but I’ve simulated time before, so I got some ideas. Will add a to do item about it, thanks!

So in the end you can write your business logic in io-sim, interpret it in IO for your executable and test it with quickcheck pure-ly, everything with the same code.

It would be interesting to see a port of the counter example with increments from the post to your approach. In particular I’m curious to see how you can test different interleavings of threads.

1 Like

It would be interesting to see a port of the counter example with increments from the post to your approach. In particular I’m curious to see how you can test different interleavings of threads.

Your example uses IORef which don’t exist yet in io-sim, however I don’t think it is something one can’t implement :slight_smile: Regarding testing different interleaving of threads there are actually 3 approaches:

  1. threadDelays. Since time in io-sim does not pass unless something block either via threadDelay or STM, instrumenting the code with threadDelays to simulate time passing between computations is one way to get different interleavings of threads. Essentially we generate some random delays with Quickcheck which get used randomly in the instrumented code. This is particularly useful in networking code where we want to simulate connections and packet transmission times (and failures!). Note that this is still deterministic. And for each test we might find different interleavings. The nice thing about this is that shrinking will work beautifully.

  2. Either improve io-sim to have first class thread schedulers so the user can have finer grained control (I think dejafu has this) or we could simply add random jitter to every call of threadDelay in the code. Since the free monad data structure is “ours” we can instrument the code on io-sim side (note that this is not done, I am just suggesting it as a way to do it)

  3. Using io-sim-por; this approach has its costs, but if one wants to be sure it searches all the possible race condition search space effectively and tries out all the sensible race conditions, this is the thing to use. io-sim-por will find all concurrent threads that can lead to a race condition, reverse them and see what happens :slight_smile:

It should be possible to implement a broken counter with a TVar as well.

It should be possible to implement a broken counter with a TVar as well.

Yes, I only meant that the example wouldn’t be 100% equal to yours.

That’s not important.