Consider the naïve “list compression” function implemented as
compress_naive :: forall a. Eq a => [a] -> [(a, Int)]
compress_naive = \case
[] -> []
a : sa' ->
let (sa'l, sa'r) = span (a ==) sa'
in (a, 1 + length sa'l) : compress_naive sa'r
The above gets the point across mathematically, but if we care about performance then we should leverage the fact that compress streams over its input. In particular, we might rewrite it fold/build-fusibly as
{-# INLINE compress #-}
compress :: forall a. Eq a => [a] -> [(a, Int)]
compress = \ sa -> GHC.List.build $ \ g b ->
foldr
( \ a' k -> \case
Just (a, !n) -> case a == a' of
True -> k $ Just (a, n + 1)
False -> g (a, n + 1) . k $ Just (a', 0)
Nothing -> k $ Just (a', 0) )
( \case
Just (a, !n) -> g (a, n + 1) b
Nothing -> b )
( sa )
( Nothing )
with the “state of the loop” (i.e., of the left part of the folding of sa, achieved via the continuation trick) represented as a value of type Maybe (Int, Int) in which each iteration of said loop is fully (i.e., recursively) strict and wherein the Nothing constructor represents the absence of a value already consumed from the input against which to compare for equality.
(If this sucks/is otherwise wrong in some obvious way other than legibility, please let me know!)
Okay, let’s see how well GHC can optimize this in a reasonable context. Consider the “Fibonacci mod 3” sequence on a given pair of initial values defined as
{-# INLINE fibMod3 #-}
fibMod3 :: Int -> Int -> [Int]
fibMod3 = \ a0 a1 ->
fmap fst $ iterate' (\ (!a0', !a1') -> (a1', flip rem 3 $ a0' + a1')) (a0 `mod` 3, a1 `mod` 3)
and in turn let the function we wish to optimize be
compressFib :: Int -> Int -> Int -> Int
compressFib = \ a0 a1 n ->
sum . fmap (\ (a, i) -> a `div` i) . compress . filter (0 <) . take n $ fibMod3 a0 a1
The details are mostly arbitrary; the point is that on the one hand GHC (which is apparently otherwise quite clever at finding ways to do so) cannot easily prepare in advance for whether compress will be passed an empty input (when it can—for instance in the variation on the above with take n pushed left of compress—it tends to optimize the above exactly as I would have hoped, being able to fully eliminate the Maybe) and yet on the other it is intuitively clear how to rewrite the above to a tight, unboxed loop.
Unfortunately, when I compile the above with -fforce-recomp -O2 -fexpose-all-unfoldings -fspecialize-aggressively -funbox-strict-fields (GHC version 9.12.2), the “inner loop” is a mess spread across several not-fully-unboxed functions (ignoring wrappers):
$wn :: Maybe (Int, Int) -> Int# -> Int#
compressFib1 :: Int -> (Maybe (Int, Int) -> Int -> Int) -> Maybe (Int, Int) -> Int -> Int
compressFib_$s$wgo :: Int# -> Int# -> Int# -> Maybe (Int, Int) -> Int -> Int
But a (fully, i.e., recursively) strictly-consumed argument of type Maybe (Int, Int) can be unboxed to one of type (# (# #) | (# Int#, Int# #) #) and then represented in turn as three Int#s.
So what gives?