Foldl traverses with State, foldr traverses with anything

tomjaguarpaw · January 9, 2024, 12:03pm

Thanks. Would you mind adding the following to your benchmarks?

findR'' f =
  foldr (flip (\xs x -> if f x then Just x else xs)) Nothing . reverse

jaror · January 9, 2024, 12:30pm

Oddly enough slightly better than findR:

  findR'' expensive: OK
    79.5 ns ± 7.8 ns, 519 B  allocated,   0 B  copied, 6.0 MB peak memory
  findR'' cheap:     OK
    46.3 μs ± 4.1 μs, 234 KB allocated, 6.7 KB copied, 7.0 MB peak memory

But what’s the difference between foldl and \k z -> foldr (flip k) z . reverse?

jaror · January 9, 2024, 12:36pm

Using reverse breaks fusion:

  findR   expensive fused: OK
    55.6 ns ± 3.5 ns, 671 B  allocated,   0 B  copied, 7.0 MB peak memory
  findR   cheap fused:     OK
    24.0 μs ± 1.8 μs, 312 KB allocated,  12 KB copied, 7.0 MB peak memory

  findR'  expensive fused: OK
    691  ns ±  55 ns, 645 B  allocated,   0 B  copied, 7.0 MB peak memory
  findR'  cheap fused:     OK
    19.1 μs ± 1.6 μs, 156 KB allocated,   9 B  copied, 7.0 MB peak memory

  findR'' expensive fused: OK
    238  ns ±  21 ns, 2.0 KB allocated,   0 B  copied, 7.0 MB peak memory
  findR'' cheap fused:     OK
    130  μs ± 6.8 μs, 937 KB allocated,  44 KB copied, 7.0 MB peak memory

That’s when I add these benchmarks:

  , bench "findR   expensive fused" $ nf (\(x,y,z) -> findR (even . fib) (enumFromThenTo x y z)) (20,19,1 :: Int)
  , bench "findR   cheap fused" $ nf (findR  even . uncurry enumFromTo) (1,10000 :: Int)

tomjaguarpaw · January 9, 2024, 12:38pm

what’s the difference between foldl and \k z -> foldr (flip k) z . reverse?

foldl builds a closure of size O(n) before proceeding to evaluate it. foldr ... . reverse works in constant space (assuming the list can be garbage collected as it is reversed).

Taken literally my statement is, sadly, obviously false. I would like to find a criterion that distinguishes “genuine” uses of foldl from uses that “should be” a foldr on the reversed list, but I’m not sure what that criterion should be. My belief is still that “genuine” uses of foldl should always be foldl' instead.

jaror · January 9, 2024, 12:51pm

This benchmark seems to contradict you:

{-# NOINLINE xs #-}
xs :: [Int]
xs = [1..10000]

triviall, trivialr :: [a] -> ()
triviall = foldl const ()
trivialr = foldr (flip const) () . reverse
{-# NOINLINE triviall #-}
{-# NOINLINE trivialr #-}

main = defaultMain
  [ bench "trivial foldl" $ whnf triviall xs
  , bench "trivial foldr.reverse" $ whnf trivialr xs
  ]

  trivial foldl:         OK
    226  μs ±  17 μs, 390 KB allocated,  19 KB copied, 7.0 MB peak memory
  trivial foldr.reverse: OK
    67.3 μs ± 6.2 μs, 234 KB allocated, 6.7 KB copied, 7.0 MB peak memory

It does allocate slightly less, but it still allocates.

tomjaguarpaw · January 9, 2024, 12:54pm

Oh foldr ... . reverse definitely allocates. It allocates an entire new list! The question is the maximum residency. In principle foldr ... . reverse can work in constant extra space, foldl always uses O(n) extra space.

tomjaguarpaw · January 9, 2024, 1:01pm

Ah, how about this criterion: I don’t know of a use of foldl that shouldn’t be either foldl', or foldr ... . reverse. It be interested know if there are any counterexamples. I can just about believe that there are counterexamples for delicate performance reasons, but beyond that I feel confident that there aren’t.

jaror · January 9, 2024, 1:04pm

I’m still not convinced so I did some heap profiling and got these pictures: (nevermind this, I was doing something wrong)

tomjaguarpaw · January 9, 2024, 1:07pm

I’m still not convinced

Convinced about what? If you do an evaluation by hand with pen and paper I think you’ll be convinced about the overall principle. What GHC actually does in practice is another matter, and something I know less about.

tomjaguarpaw · January 9, 2024, 1:23pm

I found the general rule about when to use foldr vs foldl’ a bit vague

According to the Principle of Least Power you should use foldl' in preference to foldr when you can.

@hasufell, yes, perhaps I should have connected the dots a bit more explicitly here. What I mean by “when you can” is “when you want to do what foldl' does” according to the article, that is, traverse a list with (only) a state.

Thanks for linking Alexis’s post. I’ll link to it from my article. My problem with her rules of thumb is that they presuppose that you already have an (“accumulation”) function (which is also a misnomer: in the case of foldr it doesn’t accumulate). That it, it’s a rule that you can use if you start with a function, to determine which fold to use with that function. By contrast, I’m trying to present a rule that you can use if you start with a problem, to determine which fold can solve your problem. If your problem is “traverse with (only) a state”-shaped then the answer is foldl'. If it’s “traverse with anything else”-shaped then the answer is foldr (or in either case you could just use for_). Does that explanation help? If so I wonder if I can incorporate it into the article somehow, or perhaps write another article with more examples.

Hi @tobz619, yes, your description of how forStateFromFoldl works is correct.

Use foldl' when you must absolutely traverse the whole list, reason being you are very much interested in updating a state with each value and you want to do it in constant space.

Yes, or you could just say “Use foldl' when you’re traversing the list with (only) a state”.

Use foldr when your function is lazy in the second argument …

I’m trying to get away from talking about (accumulator) functions because that presupposes one already has one, and if one does then one is likely to already know if one needs foldl or foldr. Instead, I’d say “Use foldr when you’re traversing a list in any way that requires having keeping track of (only) a state”.

jaror · January 9, 2024, 1:25pm

That’s a good idea, working through the examples manually sheds more light on it:

foldr k z = go where
  go [] = z
  go (x:xs) = k x (foldr k z xs)

reverse l = go l [] where
  rev [] a = a
  rev (x:xs) a = rev xs (x:a)

const' _ y = y

foldr const' () (reverse (1:2:3:[]))
go (reverse (1:2:3:[]))
go (rev (1:2:3:[]) [])
go (rev (2:3:[]) (1:[]))
go (rev (3:[]) (2:1:[]))
go (rev [] (3:2:1:[]))
go (3:2:1:[])
const' 3 (go (2:1:[]))
go (2:1:[])
const' 2 (go (1:[]))
go (1:[])
const' 1 (go [])
go []
()

foldl k z = go z where
  go a [] = a
  go a (x:xs) = go (k a x) xs

foldl const () (1:2:3:[])
go () (1:2:3:[])
go (const () 1) (2:3:[])
go (const (const () 1) 2) (3:[])
const (const (const () 1) 2) 3
const (const () 1) 2
const () 1
()

Conclusion: both approaches work in constant (additional) space! And both approaches do O(n) allocations (closures or cons cells).

So your statement:

Is technically true, but it seems to imply that foldl uses more space than foldr ... . reverse, which is not true (assuming cons cells and closures are the same size, which is not quite true in practice).

tomjaguarpaw · January 9, 2024, 1:36pm

Ah yes, I see this is rather subtle. foldl builds a closure of size O(n) before proceeding to evaluate it, reverse builds a list of size O(n) before proceeding with foldr. I considered the former “extra” but not the latter, but you’ve convinced me that’s not a valid thing to do.

So yes, their operation ought to be roughly the same in terms of allocations and when memory is freed. Perhaps in foldl's favour, it “fuses” the reverse with the operation k. I’m not sure if GHC’s foldr fusion does this automatically for foldr-and-reverse (because reverse is written in terms of foldl, which is written in terms of foldr). So perhaps I ought to say "foldl might be useful in some cases where you want to fuse foldr and reverse, even though writing the latter is clearer. I don’t know yet whether the latter optimizes to the former anyway.".

jaror · January 9, 2024, 1:42pm

That’s only in the report version (toggled with the USE_REPORT_PRELUDE flag). The definition I show above is the definition that GHC usually uses.

And foldr _ _ (foldl _ _ _) doesn’t fuse. It would have to use a build in between, which the report version doesn’t do.

And we can see that no fusion is happening in this benchmark I showed: Foldl traverses with State, foldr traverses with anything - #23 by jaror

tomjaguarpaw · January 9, 2024, 1:51pm

Oh, so it is. I wonder why that is. I don’t see any benefit to writing it out explicitly, given that you could just inline foldl. Are there cases where GHC.List.foldl is worse than writing it out by hand?

In any case, I’m happy to say I understand now that foldl might in principle be a useful optimization of foldr ... . reverse, although I don’t understand the circumstances in which it would be better, and I would prefer it was called fusedFoldrReverse! Thanks for working it through with me, @jaror.

treblacy · January 9, 2024, 11:53pm

I am (or was) very fascinated with the foldl-from-foldr old riddle.

I disagree with the pair of sentiments (the solution is difficult to understand, the composition foldlFromForState. forFromFoldr is mechanical and needs no brain power)!

In one direction, the role of Const (Endo _) and why/how it gets foldrFromFor to work is equally difficult/easy to understand. Anything you say to explain that, I can translate it to explain foldl-to-foldr directly.

In the other direction, the only non-mechanical “creative” brain power needed to solve foldl-from-foldr is a clever change in the argument order. Then the rest is mechanically recognizable as following the foldr idiom.

altFoldl :: (b -> a -> b) -> [a] -> (b -> b)
altFoldl op = go
  where
    go [] = \a -> a
    go (x:xs) = \a -> (go xs) (op a x)
              = (\x r -> \a -> r (op a x)) x (go xs)
∴
altFoldl op = foldr funny id
  where
    funny = \x r a -> r (op a x)
          -- The following massage will become useful below.
          = \x r -> r . (flip op x)

In fact, learning from @Bodigrim’s point, the central idea is Dual Endo:

absFoldl :: (b -> a -> b) -> [a] -> Dual (Endo b)
absFoldl op = foldr endure mempty
  where
    endure = \x r -> Dual (Endo (flip op x)) <> r

Next project for you! Instead of for_ involving both Foldable and Applicative, can you stick to Foldable alone and make foldlFromFoldMap, foldrFromFoldMap, and foldMapFromFoldr? They will show much more clearly the central role of Endo (and Dual Endo—you can basically look at absFoldl above and extract the foldMap version) without the Const dead weight.

Topic		Replies	Views
Beginner Unsure of Next Step for solution (perhaps a foldl?) Learn	5	713	October 22, 2021
Scrap your iteration combinators Links	36	1123	May 27, 2025
Using `foldl'` with closure Learn	3	405	November 11, 2021
Implementing `cycle` with a fold Learn	5	249	February 13, 2025
Applicative Logic Links	4	1006	October 2, 2024

Foldl traverses with State, foldr traverses with anything

Related topics