Why not use smallcheck?

Hah, I wanted to write a comment here about how the enumeration could order the candidates by the sum of all the integers in the list (e.g. [1,1] would come immediately before or after [2]). But then you’re still generating an exponential number of tests. It turns out that 18 is still kind of reachable: there are 131,072 tests before you reach replicate 18 1. However, it takes many more tests before you get to any kind of representative array of 18 numbers. Each time the sum increases by one, the number of tests required doubles. I’m guessing you were lucky that the bug appeared for an array with 17 zeroes and a single one (or something like that), which would “only” require between 262,144 and 524,288 tests.
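
To make that counting argument concrete, here is a minimal sketch, assuming the candidates are lists of positive integers ordered by their sum. The names compositions, bySum and testsBefore are made up for this illustration and do not come from any library.

```haskell
-- All lists of positive integers whose elements sum to n
-- (there are 2^(n-1) of them for n >= 1, and one, the empty list, for n = 0).
compositions :: Int -> [[Int]]
compositions 0 = [[]]
compositions n = [k : rest | k <- [1 .. n], rest <- compositions (n - k)]

-- Hypothetical enumeration order: every list with sum 0, then sum 1, and so on.
bySum :: [[Int]]
bySum = concatMap compositions [0 ..]

-- Number of candidates that come before the first list with sum n.
testsBefore :: Int -> Int
testsBefore n = sum [length (compositions s) | s <- [0 .. n - 1]]

main :: IO ()
main = do
  print (take 8 bySum)    -- [[],[1],[1,1],[2],[1,1,1],[1,2],[2,1],[3]]
  print (testsBefore 18)  -- 131072
```

In this particular order replicate 18 1 happens to be the very first list with sum 18, so it sits right after those 131,072 candidates, and every further step in the sum doubles the size of the group you have to get through.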


Thinking about it again, maybe you could do much better if you made an ordering based on merging different orders of magnitude. Consider generating tests for a single integer. We could consider the ranges 0-9, 10-99, 100-999, etc. separately and then merge their orders. So you’d get 0, 10, 100, …, 1, 11, 101, …, 2, 12, 102, …
For lists (or vectors) you’d first test length 1, then length 10, then 100, etc.
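
A rough sketch of that merge for single integers could look like this. It assumes a fixed number of magnitudes (with infinitely many, the first round would never finish), and magnitude and mergedInts are hypothetical names, not part of any existing library.

```haskell
import Data.List (transpose)

-- The k-th order of magnitude: [0..9], [10..99], [100..999], ...
magnitude :: Int -> [Int]
magnitude 0 = [0 .. 9]
magnitude k = [10 ^ k .. 10 ^ (k + 1) - 1]

-- Round-robin merge of the first m magnitudes: take the first element of
-- each range, then the second of each, and so on. transpose copes with the
-- ranges having different lengths, so exhausted ranges simply drop out.
mergedInts :: Int -> [Int]
mergedInts m = concat (transpose (map magnitude [0 .. m - 1]))

main :: IO ()
main = print (take 9 (mergedInts 3))  -- [0,10,100,1,11,101,2,12,102]
```

The same round-robin could be applied to list lengths instead of values, which is where the cost of the large cases starts to matter.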

I guess one problem will be that the larger lists will take much longer to test, so maybe we want to bias this more towards the smaller lists. Also, this means we would probably want shrinking again.


You asked me about SmallCheck, not about your library. Here is a test suite I based my previous answer on:

#!/usr/bin/env cabal
{- cabal:
build-depends: base, tasty, tasty-smallcheck
default-language: GHC2021
-}

import Test.Tasty
import Test.Tasty.SmallCheck
import Data.List (sort)

main :: IO ()
main = defaultMain $
  -- depth 19 means that we are building lists of length 0..18
  localOption (SmallCheckDepth 19) $
    testProperty "mySort matches the reference implementation" $
      \xs -> mySort xs == sort xs

mySort :: [Int] -> [Int]
mySort xs
  | length xs < 18 = sort xs
  | otherwise = error "explode"

This completed ~65,000 tests in 10 minutes on my machine and does not seem to be anywhere near the counterexample. If your library completes 200,000+ tests in 0.3 seconds (which is roughly 6,000 times faster), that’s a great result, but it does not relate to SmallCheck.
