Equivalence of unfoldr and unfold

I recently learned about unfoldr, and how to use it in Haskell. This is one possible implementation of it (though not the actual one):

unfoldr :: (b -> Maybe (a, b)) -> b -> [a]
unfoldr step b = case step b of 
  Nothing -> []
  Just (a, b') -> a : unfoldr step b'

I then learned from reading this paper (“The Under-Appreciated Unfold”) that unfoldr is equivalent to this function. I added the where declarations to make the equivalence to the above version more obvious.

unfold :: (b -> Bool) -> (b -> a) -> (b -> b) -> b -> [a]
unfold done f g b
  | done b = []
  | otherwise = a : unfold done f g b'
  where
    a  = f b
    b' = g b

To show the equivalence (and for fun, since I had never once used unfoldr even once before this), I rewrote some common functions using both unfoldr and unfold. I’ll leave them at the bottom.

I don’t think I would have thought of them being equivalent if I hadn’t read it in this paper. How can you learn to see the equivalence of things like this?

After writing those functions, I read the rest of the paper. The authors wrote a bread-first tree traversal with the second definition of unfold.

data Tree a = Node a [Tree a]
type Forest a = [Tree a]
    
root (Node a ts) = a
kids (Node a ts) = ts
    
bftf :: Forest a -> [a]
bftf ts = bftf' (ts, [])
    
bftf' :: (Forest a, Forest a) -> [a]
bftf' = unfold p f g 
  where
    p (ts, vs) = null ts && null vs
    f ([], vs) = f (reverse vs, [])
    f (t : ts, vs) = root t
    g ([], vs) = g (reverse vs, [])
    g (t : ts, vs) = (ts, reverse (kids t) ++ vs)

I rewrote the function bftf' to use the standard unfoldr.

bftf' :: (Forest a, Forest a) -> [a]
bftf' = unfoldr f
  where
    f queue = case queue of
      ([], []) -> Nothing
      ([], back) -> f (reverse back, [])
      (Node root children : siblings, back) =
        -- Dequeue the current node and enqueue its children.
        let queue' = (siblings, revAppend children back)
        in Just (root, queue')
    -- revAppend xs ys == reverse xs ++ ys
    revAppend xs ys = foldl' (flip (:)) ys xs

The standard functions I rewrote are iterate, map, zip, takeWhile, (++), and filter. Many of these functions can also be written with foldr, interestingly. The list functions written with unfold seem to have an uncomfortable reliance on using head and tail; their correct usage relies on the predicate function guaranteeing that the list isn’t empty.

-- iterate f x == [x, f x, f (f x), ...]
iterate :: (x -> x) -> x -> [x]
iterate f x = unfold (const False) id f x
iterate f x = unfoldr (\x -> Just (x, f x)) x

-- map f [x1, x2, ..., xn] == [f x1, f x2, ..., f xn]
map :: (x -> y) -> [x] -> [y]
map f = unfold null (f . head) tail
map f = unfoldr $ \list -> 
  case list of
    [] ->  Nothing
    (x : xs) -> Just (f x, xs)

-- zip [x1, ..., xn] [y1, ..., yn] == [(x1, y1), ..., (xn, yn)]
zip :: [x] -> [y] -> [(x, y)]

 -- with unfold:
zip xs ys = unfold eitherNull headHead tailTail (xs, ys)
  where
    eitherNull (xs, ys) = null xs || null ys
    headHead (xs, ys) = (head xs, head ys)
    tailTail (xs, ys) = (tail xs, tail ys)

-- With unfoldr:
zip xs ys = unfoldr step (xs, ys)
  where
    step (x : xs, y : ys) = Just ((x, y), (xs, ys))
    step _ -> Nothing

-- takeWhile (< 3) [1, 2, 3, 2] = [1, 2]
takeWhile :: (x -> Bool) -> [x] -> [x]

-- With unfold:
takeWhile p = unfold done head tail 
  where 
    done [] = True
    done (x : xs) = not (p x)

-- With unfoldr:
takeWhile p = unfoldr $ \xs -> 
  case xs of
    (x : xs') | p x -> Just (x, xs')
     _ -> Nothing

-- [x1, ..., xn] ++ [y1, ..., yn] = [x1, ..., xn, y1, ..., yn]
(++) :: [a] -> [a] -> [a]

 -- With unfoldr: much less ugly than with unfold
(++) xs ys = unfoldr step (xs, ys)
  where
    -- While xs has elements, take an x from xs. 
    step (x : xs, ys) = Just (x, (xs, ys))
    -- Then, while ys has elements, take a y from ys.
    step ([], y : ys) = Just (y, ([], ys))
    -- Then, the new list is finished.
    step ([], []) = Nothing

-- With unfold: 
-- Translated by hand from the unfoldr version, 
-- and more difficult to verify as correct.
(++) xs ys = unfold done f g (xs, ys)
  where
    done (xs, ys) = null xs && null ys
    f (xs, ys) = head (if null xs then ys else xs)
    g (x : xs, ys) = (xs, ys)
    g ([], ys) = ([], tail ys)

filter :: (a -> Bool) -> [a] -> [b]
filter p = unfoldr f
  where
    f [] = Nothing
    f (x : xs) | p x = Just (x, xs)
               | otherwise = f xs

-- I haven't figured out how to write `filter` with `unfold`,
-- but I'm sure it's possible, though probably uglier.

Anyway, besides that one question I had (how can you learn to see equivalences like this?), I wrote this in large part because I thought it was pretty interesting to see the equivalence between these two functions. Isn’t it? Can you rewrite any other higher-order functions like that ?

2 Likes

To see these equivalences, I always fall back on good old algebra. The algebra in algebraic data types tells us:

Maybe A  = 1 + A
(A, B)   = A * B
Bool     = 2 (equivalent to 1 + 1)
(A -> B) = B ^ A (exponentiation)

And keep in mind currying: (A, B) -> C = A -> B -> C (or in algebraic terms: C ^ (A * B) = (C ^ B) ^ A).

Using these ingredients we can compare the types of unfold and unfoldr. First of all I’d uncurry unfold to get:

unfold :: (b -> Bool, b -> a, b -> b) -> b -> [a]

Now both types are of the form ... -> b -> [a] so we can ignore that part and compare only what is on the …, let’s rewrite both of those to algebraic formulas:

unfoldr :: (1 + a * b) ^ b -> ...
unfold :: (2 ^ b) * (a ^ b) * (b ^ b) -> ...

We can simplify the formula of unfold to (2 * a * b) ^ b using standard algebraic rules. Now we only need to show that 1 + a * b is equivalent to 2 * a * b.

That’s actually not true! However, if we convert them back to types: Maybe (a, b) and (Bool, a, b) it is pretty straightforward to implement two functions that convert between these two:

to :: Maybe (a, b) -> (Bool, a, b)
to Nothing = (False, undefined, undefined)
to (Just (x,y)) = (True, x, y)

from :: (Bool, a, b) -> Maybe (a, b)
from (False, _, _) = Nothing
from (True, x, y) = Just (x, y) 

As long as you use the types in this way, i.e. ignoring the a and b if the bool is false, you can treat the functions as equivalent.

2 Likes

Thank you for the overview on thinking of types algebraically. That also makes a lot of sense; it’s just like how logical operators correspond to arithmetic (i.e., AND is multiplication, OR is addition, etc.), and arithmetic laws apply to them as well (e.g. PQ + PR = P(Q + R)). Do you know of good resources to learn more about thinking of types algebraically (e.g., papers, books)? It makes sense, given that tuples and records are called product types and types that can be one of multiple things are called sum types; there’s already a hint at algebra that I have mostly ignored til now.

There’s a good blog post here: The algebra (and calculus!) of algebraic data types

And there’s a talk on YouTube:

3 Likes

Try expressing unfold via unfoldr and vice versa:

unfold done f g = unfoldr _
unfoldr f = unfold _ _ _

If you can fill the blanks, it means that functions are equivalent.

4 Likes
import Data.List (unfoldr)

unfold' :: (a -> (b, Maybe a)) -> a -> [b]
unfold' f = unfoldr (u . f)
    where
        u (b, Just a) = Just (b, a)
        u _           = Nothing

unfoldr' :: (b -> Maybe (a, b)) -> b -> [a]
unfoldr' f = unfold' (u . f)
    where
        u (Just (a, b)) = (a, Just b)
        u _             = (undefined, Nothing)

main = do
    print $ unfold' (\x -> (x, if x < 10 then Just (x + 1) else Nothing)) 0
    print $ unfoldr' (\x -> if x < 10 then Just (x, x + 1) else Nothing) 0

Is using undefined a valid option if you can prove you don’t reach it, or should you aim to do it like:

unfoldr' :: (b -> Maybe (a, b)) -> b -> [a]
unfoldr' f bb = case f bb of
    Nothing     -> []
    Just (x, _) -> let u (Just (a, b)) = (a, Just b)
                       u _             = (x, Nothing)
                   in unfold' (u . f) bb
2 Likes

You can show that the use of undefined is harmless by showing that it is never actually used. In this case, unfold' doesn’t use the first element of the tuple if the second element is not Just, so the undefined will never be used.

In real world code, you should be careful because a refactoring might silently change these assumptions. It would be a good idea to add clear comments explaining which parts of the code are involved in making sure that the undefined is not used.

4 Likes

It turns out that this was an exercise in this chapter from the book “The Fun of Programming,” chapter 3:
Origami Programming (ox.ac.uk). It’s on page 5 of the PDF, though it’s labeled as page 45.

This neat analysis shows that the arguments to unfold contain more information than necessary: We actually call the b -> a and the b -> b functions only for those b that don’t satisfy the stop condition b -> Bool.
Suppose you were to implement the

group :: Eq t => [t] -> [NonEmpty t]

using unfold. Here, b = [t] and a = NonEmpty t. The stop condition would be null, but then one of the other two functions must be partial: The first function Eq t => [t] -> NonEmpty t takes the longest non-empty prefix that equals the head, the second function drops that prefix.

1 Like