Okay. I didn’t like the “discard laziness altogether” approach, so I thought about it again. If the problem is that the lazy bytestring indexing/reading functions are linear, then it should also help performance to keep the laziness but stop indexing into the bytestring by position and instead consume it linearly.
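To make the difference concrete, here is a minimal sketch of the two access patterns on a lazy bytestring. The function names `sumByIndex` and `sumByUncons` are made up for illustration and are not part of the code in this thread. As far as I understand it, `BSL.index` has to walk the chunk list from the front on every call, while `BSL.uncons` only ever looks at the head and hands back the remainder.

```haskell
{-# LANGUAGE BangPatterns #-}

import qualified Data.ByteString.Lazy as BSL

-- Positional access: every BSL.index call starts again from the first chunk.
sumByIndex :: BSL.ByteString -> Int
sumByIndex bs = go 0 0
  where
    n = BSL.length bs
    go !acc !i
      | i >= n = acc
      | otherwise = go (acc + fromIntegral (BSL.index bs i)) (i + 1)

-- Linear consumption: take the head byte, continue with the rest.
sumByUncons :: BSL.ByteString -> Int
sumByUncons = go 0
  where
    go !acc bs = case BSL.uncons bs of
      Nothing -> acc
      Just (b, rest) -> go (acc + fromIntegral b) rest
```

Both compute the same result; only the second one never revisits bytes it has already passed.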
I rewrote it to not nest quite as much, and this does cut the run time, but not as much as switching to a strict bytestring does (it’s still twice as slow as the strict version and I’m not sure why). Laziness might help you for very large files, though, so it’s worth improving the lazy approach.
decompressSingleChunk ::
  BSL.ByteString ->
  MV.STVector (MV.PrimState (ST s)) Int64 ->
  Int ->
  Int64 ->
  ST s ()
decompressSingleChunk !s !mutableVector !outPos !value
  | outPos >= MV.length mutableVector = pure ()
  | otherwise = case readDelta s of
      Nothing -> pure ()
      Just (!d, !rest) -> do
        {-# SCC "writeV" #-} MV.unsafeWrite mutableVector outPos (value + d)
        decompressSingleChunk rest mutableVector (outPos + 1) (value + d)
readDelta :: BSL.ByteString -> Maybe (Int64, BSL.ByteString)
readDelta bs = do
  (!b, !rest1) <- BSL.uncons bs
  let !delta8 = fromIntegral (fromIntegral b :: Int8) :: Int64
  if -127 <= delta8 && delta8 <= 127
    then pure (delta8, rest1)
    else readDelta16 rest1
  where
    readDelta16 !rest1 = do
      (!delta16, !rest2) <- {-# SCC "read16" #-} runGetMaybe getInt16le rest1
      if -32767 <= delta16 && delta16 <= 32767
        then pure (fromIntegral delta16, rest2)
        else readDelta32 rest2
    readDelta32 !rest2 = do
      (!delta32, !rest3) <- {-# SCC "read32" #-} runGetMaybe getInt32le rest2
      if -2147483647 <= delta32 && delta32 <= 2147483647
        then pure (fromIntegral delta32, rest3)
        else readDelta64 rest3
    readDelta64 !rest3 = do
      (!delta64, !rest4) <- {-# SCC "read64" #-} runGetMaybe getInt64le rest3
      pure (delta64, rest4)
runGetMaybe :: Get a -> BSL.ByteString -> Maybe (a, BSL.ByteString)
runGetMaybe !g !bs = case runGetOrFail g bs of
  Left _ -> Nothing
  Right (!rest, _, !v) -> Just (v, rest)
I just threw a bunch of bangs in there. Some of them might not make much of a difference.
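For anyone trying to follow `readDelta`, it may help to see the format from the writing side. Reading the decoder backwards, each width seems to reserve its minimum value as an escape marker: a delta that doesn't fit in ±127 is preceded by an `Int8` of -128, one that doesn't fit in ±32767 additionally by an `Int16` of -32768, and so on. The encoder below is only a sketch inferred from the decoder above, not the code that actually produced the files; `encodeDelta` is a hypothetical name.

```haskell
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy as BSL
import Data.Int (Int64)

-- Assumed inverse of readDelta: use the narrowest width that can hold the
-- delta, writing each narrower width's minBound first as an escape marker.
encodeDelta :: Int64 -> B.Builder
encodeDelta d
  | fits 127 = B.int8 (fromIntegral d)
  | fits 32767 = B.int8 minBound <> B.int16LE (fromIntegral d)
  | fits 2147483647 =
      B.int8 minBound <> B.int16LE minBound <> B.int32LE (fromIntegral d)
  | otherwise =
      B.int8 minBound <> B.int16LE minBound <> B.int32LE minBound <> B.int64LE d
  where
    -- Symmetric range check, mirroring the comparisons in readDelta.
    fits lim = negate lim <= d && d <= lim

-- Convenience helper for inspecting the bytes of a single encoded delta.
encodeDeltaBytes :: Int64 -> BSL.ByteString
encodeDeltaBytes = B.toLazyByteString . encodeDelta
```

Under this scheme a delta of 300 comes out as the three bytes 0x80 0x2C 0x01: the one-byte escape marker followed by 300 as a little-endian `Int16`.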