Dynamic Programming in Haskell?

Hi, I am new to Haskell and have been practicing with it on some tree combinatorics problems.

One problem I practiced with was enumerating all Full Binary Trees (FBTs) with n internal nodes, so I need to have a list of all full binary trees with n internal nodes instead of just a count. I did this in two ways. The first was “fbtsDirect,” which just directly recurses for different combinations of trees to create a list of FBTs with n internal nodes. The other is “fbts,” which (as far as I know) builds lazy arrays and gets previous computations from them as they are constructed.

data FBT = Leaf | Internal FBT FBT

fbtsDirect :: Int -> [FBT]
fbtsDirect 0 = [Leaf]
fbtsDirect n = map (uncurry Internal) (pairs (n - 1))
    where    
        pairs nRem = foldl (\pairsAcc i -> combPairs (fbtsDirect i) (fbtsDirect (nRem - i)) ++ pairsAcc) [] [0..nRem]

fbts :: Int -> [FBT]
fbts n = fbtArr ! n
    where
        fbtArr = array (0, n) [(i, f i) | i <- [0..n]]
        fbtPairsArr = array ((0, 0), (n, n)) [((i, j), combPairs (fbtArr ! i) (fbtArr ! j)) | i <- [0..n], j <- [0..n]]
        f 0 = [Leaf]
        f i = map (uncurry Internal) (f' (i - 1))
        f' i = foldl (\pairsAcc i' -> fbtPairsArr ! (i', i - i') ++ pairsAcc) [] [0..i] -- I think this is the bottleneck

combPairs :: [a] -> [a] -> [(a, a)]
combPairs l1 l2 = foldl (\accTotal x1 -> 
        accTotal ++ foldl (\acc2 x2 -> 
            (x1, x2) : acc2
        ) [] l2
    ) [] l1

The issue I am finding is that the two run almost identically slow for all inputs I tried, and too slow to reasonably solve problems above around n=15. What I am confused about is why “fbts” would be as slow as the other solution.

I suspect it has most to do with combPairs, which is a function I wrote that folds over two lists to create cartesian pairs (shown in the image), and with the f’ function that adds the pairs for each pair of ints the remaining internal nodes can be split into to an accumulated list.

So, my main questions are:

  • Why are the two functions the same in speed when “fbtsDirect” re-computes trees all the time?

  • What would be the concept behind a more “Haskell” approach to solving this efficiently?

  • Is there anything you see that I am missing or misunderstanding?

As an aside, this is one of my first community posts, so please let me know if there is any etiquette to know about and if this is well-formed enough.

Thank you

Re etiquette: if you’re going to ask for help, you should make it as easy as possible for someone to help you—e.g. by providing code as code, rather than an image.

I happen to have been working on similar enumerations recently, so I’m willing to take a look at this, but shoving that image through terrible free OCR and fixing all the glitches was a pain in the arse.

Anyway, combPairs is indeed slow—swapping it out for liftA2 (,) provides quite a dramatic speedup. Cleaning up the code further, I arrive at:

data FBT = Leaf | Internal FBT FBT

type Rec a = a -> a

fbtsRec :: Rec (Int -> [FBT])
fbtsRec rec n = if n <= 0 then [Leaf] else do
  let m = n - 1
  i <- [0 .. m]
  liftA2 Internal (rec i) (rec (m - i))

fbtsDirect :: Int -> [FBT]
fbtsDirect = fix fbtsRec

fbts :: Int -> [FBT]
fbts n = fbtArr ! n
 where
  fbtArr = array (0, n) [(i, f i) | i <- [0..n]]
  f = fbtsRec (fbtArr !)

Compiling print $ length (fbts 15) with -O1 and running with +RTS -s shows a ~2.2x speedup over fbtsDirect.
If you look closer it’s actually a ~4.8x speedup in MUT time, but these gains are being blown out by the absurd amount of GC time; both versions are using a stupid amount of memory and dynamic programming techniques only make this worse.

Enumeration is streaming adjacent; rather than trying to get speedups by holding onto more data, it would be better to enumerate in constant space.
My own experiments in this area are ongoing so I can’t really give clear guidance on how to achieve it, but I’ve had the best luck with Church-encoded lists.

This implementation is much the same as the direct one, but uses a minute fraction the space (at the cost of being a bit slower):

fbtsEnum :: Int -> Enumeration FBT
fbtsEnum n = if n <= 0 then pure Leaf else do
  let m = n - 1
  i <- fromFoldable [0 .. m]
  liftA2 Internal (fbtsEnum i) (fbtsEnum (m - i))

Thanks for the feedback about the mode of sharing the code. I switched from the image to direct code.

Also, your fbtsRec implementation makes sense to me, and I thank you for introducing me to liftA2, but why do you include Rec/fix in fbtsRec and Direct?

When writing a function for the purpose of taking its fix-point, I choose to write its type signature with Rec to clarify that intent (and avoid duplication); if you just see e.g. (Int -> [Foo]) -> Int -> [Foo] in the wild, you can’t know the purpose of that function argument.

Using fix rather than writing fbtsDirect = fbtsRec fbtsDirect is something of an arbitrary choice, but you can consider it a minor follow-up to Rec.