This problem does give very interesting benchmark results. As long as K is less than about 6% of N the lazy version is faster (the lazy version is basically instant if K=1), but otherwise the strict version is faster (up to about 10x faster it seems):
list 100000 99999: OK
973 ms ± 20 ms, 5.5 GB allocated, 218 MB copied, 14 MB peak memory
list 100000 6000: OK
278 μs ± 8.1 μs, 2.6 MB allocated, 152 KB copied, 14 MB peak memory
list 100000 1000: OK
37.6 μs ± 2.2 μs, 445 KB allocated, 3.9 KB copied, 14 MB peak memory
list 100000 1: OK
11.3 ns ± 786 ps, 239 B allocated, 0 B copied, 14 MB peak memory
vector 100000 99999: OK
99.2 ms ± 4.5 ms, 354 MB allocated, 24 KB copied, 14 MB peak memory
vector 100000 6000: OK
232 μs ± 15 μs, 1.5 MB allocated, 95 B copied, 17 MB peak memory
vector 100000 1000: OK
230 μs ± 16 μs, 1.5 MB allocated, 66 B copied, 18 MB peak memory
vector 100000 1: OK
231 μs ± 13 μs, 1.5 MB allocated, 85 B copied, 18 MB peak memory
And interestingly, -fllvm
is able to do a better job with the unboxed vectors:
list 100000 99999: OK
931 ms ± 48 ms, 5.5 GB allocated, 218 MB copied, 14 MB peak memory
list 100000 6000: OK
267 μs ± 23 μs, 2.6 MB allocated, 151 KB copied, 14 MB peak memory
list 100000 1000: OK
34.6 μs ± 2.7 μs, 443 KB allocated, 3.8 KB copied, 14 MB peak memory
list 100000 1: OK
9.56 ns ± 660 ps, 239 B allocated, 0 B copied, 14 MB peak memory
vector 100000 99999: OK
82.6 ms ± 4.8 ms, 354 MB allocated, 24 KB copied, 14 MB peak memory
vector 100000 6000: OK
183 μs ± 17 μs, 1.5 MB allocated, 74 B copied, 17 MB peak memory
vector 100000 1000: OK
181 μs ± 16 μs, 1.5 MB allocated, 67 B copied, 17 MB peak memory
vector 100000 1: OK
181 μs ± 14 μs, 1.5 MB allocated, 64 B copied, 18 MB peak memory
Full benchmark code:
import Data.Vector.Unboxed qualified as VU
import Data.List (partition, unfoldr)
import Test.Tasty.Bench
reseto :: Int -> Int -> Int
reseto n k = concat (unfoldr step [2..n]) !! (k-1)
where
step [] = Nothing
step xs@(x:_) = Just $ partition ((== 0) . (`mod` x)) xs
reseto' :: Int -> Int -> Int
reseto' n k = VU.concat (unfoldr step (VU.enumFromTo 2 n)) VU.! (k-1)
where
step xs
| VU.null xs = Nothing
| otherwise = Just $ VU.partition ((== 0) . (`mod` VU.head xs)) xs
main = defaultMain
[ bench "list 100000 99999" $ nf (uncurry reseto) (100000, 99999)
, bench "list 100000 6000" $ nf (uncurry reseto) (100000, 6000)
, bench "list 100000 1000" $ nf (uncurry reseto) (100000, 1000)
, bench "list 100000 1" $ nf (uncurry reseto) (100000, 1)
, bench "vector 100000 99999" $ nf (uncurry reseto') (100000, 99999)
, bench "vector 100000 6000" $ nf (uncurry reseto') (100000, 6000)
, bench "vector 100000 1000" $ nf (uncurry reseto') (100000, 1000)
, bench "vector 100000 1" $ nf (uncurry reseto') (100000, 1)
]
Edit: We can improve the performance a bit using rem
instead of mod
:
vector 100000 99999: OK
86.6 ms ± 4.2 ms, 353 MB allocated, 20 KB copied, 14 MB peak memory
vector 100000 6000: OK
198 μs ± 16 μs, 1.5 MB allocated, 71 B copied, 19 MB peak memory
vector 100000 1000: OK
195 μs ± 18 μs, 1.5 MB allocated, 67 B copied, 20 MB peak memory
vector 100000 1: OK
195 μs ± 17 μs, 1.5 MB allocated, 64 B copied, 20 MB peak memory