One Billion Row challenge in Hs

Yeah, I think it would be great if flatparse provided a “value strict” Parser.


I have now also implemented a little adapter to make flatparse work with lazy bytestrings:

-- needs to be top-level now
f :: T -> T -> T
f (T a b c d) (T a' b' c' d') = T (min a a') (max b b') (c+c') (d+d')

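-- Run a strict-ByteString parser over a lazy ByteString, one chunk at a time.
-- k merges each chunk's result into the accumulator (starting from z).
runParserLazy :: (a -> a -> a) -> a -> Parser e a -> B.ByteString -> Result e a
runParserLazy k z p = B.foldlChunks step (OK z mempty) where
  -- bs is the input the previous run left unconsumed; prepend it to the next chunk bs'
  step (OK x bs) bs' = k x <$> runParser p (bs <> bs')
  -- any non-OK result is passed through unchanged
  step x _ = x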

main :: IO ()
main = do
  str <- B.readFile "measurements.txt"
  OK x _ <- pure $ runParserLazy (M.unionWith f) M.empty parseMeasurements str
  M.foldrWithKey (\k (T a b c d) go -> print (k, a, b, c/d) *> go) (pure ()) x

That doesn’t decrease the time on my machine by much; it is still 1.15 seconds. But it does reduce the maximum residency from 138,019,888 bytes to 207,456 bytes.
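
In case the chunk-carrying idea is easier to see without flatparse, here is a minimal sketch that uses only the bytestring package. It is not the adapter above: sumCompleteLines and sumLinesLazy are made-up names, and the "parser" is just a stand-in that sums the integers on every complete line of a strict chunk and hands back the unfinished tail. The fold has the same shape as runParserLazy, with the leftover of one chunk prepended to the next.

```haskell
{-# LANGUAGE BangPatterns #-}
import qualified Data.ByteString.Char8 as S
import qualified Data.ByteString.Lazy as L

-- Hypothetical stand-in for a parser: sum the integers on all complete
-- (newline-terminated) lines of a strict chunk and return the unfinished tail.
sumCompleteLines :: S.ByteString -> (Int, S.ByteString)
sumCompleteLines bs =
  let (complete, rest) = S.breakEnd (== '\n') bs
  in (sum [n | l <- S.lines complete, Just (n, _) <- [S.readInt l]], rest)

-- Fold over the chunks of a lazy ByteString, carrying the leftover bytes of
-- one chunk into the next so that lines split across chunks are still seen whole.
sumLinesLazy :: L.ByteString -> Int
sumLinesLazy lbs = finish (L.foldlChunks step (0, S.empty) lbs)
  where
    step (acc, leftover) chunk =
      let !(n, rest) = sumCompleteLines (leftover <> chunk)
          !acc' = acc + n
      in (acc', rest)
    -- The input may not end in a newline, so terminate the last line by hand.
    finish (acc, leftover) = acc + fst (sumCompleteLines (leftover <> S.pack "\n"))

main :: IO ()
main =
  -- The chunk boundaries deliberately cut through "34" and "-5".
  print (sumLinesLazy (L.fromChunks (map S.pack ["12\n3", "4\n-5", "\n7"])))
  -- prints 48
```

Only the current chunk and a short leftover are alive at any time, which is presumably why the residency drops so much while the running time barely changes.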
