Help with purses problem

hello, i have been very much enjoying the process of learning haskell, and i have a question about a ‘coding challenge’-type problem i started. it is a variation on the purse problem: find the smallest collection of coins that sums to a given value, given standard denominations. the problem resolves to tracing a combinatorial graph, but pruning and memoization make it tractable. unsurprisingly the brute force solution can be written in a few lines of haskell, but i am having trouble implementing the optimizations. to help, i wrote a solution in python:

import copy
from typing import List, Set, Tuple
def get_purses(purse: List[int],
               value: int,
               min_len: int=1e10,
               path: List[int]=None,
               visited: Set[int]=None) -> Tuple[int, List[List[int]]]:
    """                                                                                                                                             
    :param purse: the available denominations, eg [1, 5, 10, 25] for american currency.                                                             
    :param value: the target value.                                                                                                                 
    :param min_len: the smallest collection thus found.                                                                                             
    :param path: the current collection going down a branch.                                                                                        
    :param visited: all values that have thus far been calculated.                                                                                  
    """
    # coins must be used greedily
    purse = sorted(purse)[::-1]

    if path is None:
        path = []
    if visited is None:
        visited = set()

    if value == 0:
        min_len = min(min_len, len(path))
        return (min_len, [copy.copy(path)])
    elif len(path) >= min_len or value in visited:
        return (min_len, [])
    visited.add(value)
    all_paths = []
    for p in filter(lambda x: value - x >= 0, purse):
        path.append(p)
        (min_len, new_path) = get_purses(purse, value - p, min_len, path, visited)
        all_paths.extend(new_path)
        path.pop()

    return (min_len, all_paths)

it works perfectly fine, so as a first pass i tried to implement those same optimizations into haskell, but something doesn’t work! that is, the code compiles and finds the correct answer, but much more slowly, implying that i am doing something wrong.

import Data.List (sort)
import Data.Set (Set, insert, fromList)

composeFunctions :: [(a -> a)] -> (a -> a)
composeFunctions funcs = foldl (\f0 -> \f1 -> (f0 . f1)) (\x -> x) funcs

getPursesExpanded :: [Integer] -> Integer -> Integer -> [Integer] -> Set Integer -> (Integer, [[Integer]])
getPursesExpanded purse value min_len path visited
    | value == 0 = (min min_len (toInteger $ length path), [path])
    | (toInteger $ length path) >= min_len || elem value visited = (min_len, [])
    | otherwise = (min_min, all_paths)
    where new_visited = insert value visited
          limited_purse = filter (\x -> value - x >=0) purse
          functions_to = map (\x -> \(new_min, _) -> getPursesExpanded purse (value - x) new_min (x:path) new_visited) limited_purse
          composed_fs = [composeFunctions (reverse $ take n functions_to) | n <- [1..(length limited_purse)]]
          applied_fs = [f (min_len, []) | f <- composed_fs]
          min_min = minimum [fst x | x <- applied_fs]
          all_paths = foldl (\x -> \y -> (x ++ (snd y))) [] applied_fs


findFirstOfLength :: (Integer, [[Integer]]) -> [Integer]
findFirstOfLength (min_len, (x:xs))
    | (toInteger $ length x) == min_len = x
    | otherwise = findFirstOfLength (min_len, xs)


getPurse :: [Integer] -> Integer -> [Integer]
getPurse purse value = findFirstOfLength $ getPursesExpanded (reverse $ sort purse) value (10^10 :: Integer) [] (fromList [])

one problem i had, generally, was trying to update the minimum length path so i could pass the most recent version into subsequent searches. i tried to do this by function composition… which seems crazy… but it worked. i know there is the State monad, and i was going to use this opportunity to learn that after i resolved this.

thank you!

One small thing I noticed is that you are using foldl to compose list of functions. I forgot how continuation optimization works, but I think it should be better if you used foldr instead of foldl. Ofc, it might not help as well.
Perhaps composing functions like that using foldl is not a good idea - My brain says it may leave unevaluated thunks leading to space leak, but again, i forgot how it goes.

My guess is that the Python visited set is shared over multiple branches, whereas in the Haskell version each recursive call gets its own copy. Hence the Haskell version is exploring many more possibilities than the Python version.

There are also a few other “notable” inefficiencies including foldl as previously mentioned, but for relatively small examples they shouldn’t be that bad.

that seems right! how do i fix that?

ok, i think i just need to send visited along throughout the traversal! thank you!

1 Like

I don’t think the Python version works fine:

>>> purses.get_purses([3,5,7], 24)
(6, [[3, 3, 3, 3, 3, 3, 3, 3], [3, 3, 3, 3, 5, 7], [3, 3, 3, 3, 7, 5]])

It has missed [5,5,7,7] (amongst other orderings).

it shouldn’t give all the paths, since it’s pruning, but it’s weird you get something different - i get [3, 7, 7, 7] among a few other paths, all the same length. i think there is something subtle about the order that the coins are in - if the coins are NOT in descending order, then the function can get to a value suboptimally - the solutions must be attempted greedily. i think this is another problem in the haskell code, wherein i can’t enforce the order in which my functions are called.

i realized this last night… i was already doing it in python with a helper function, but i didn’t realize it was critical.

I suggest you just copy the Python code as closely as you can. There are a few rules:

  1. You never reassign variables. Instead make new variables.
  2. You can’t use default arguments. I suggest using a wrapper function instead (and I would actually do the same in Python – I don’t like default arguments).
  3. You can’t mutate. Whenever you need mutation across a function boundary or around a loop iteration pass an old value in and a new value out instead.

In return we get some benefits:

  1. We never have to copy things that would otherwise be mutated (e.g. we don’t have to copy.copy(path)).
  2. We can explicitly see when things are mutated (which perhaps fixes a bug? – see below).

With those principles in mind we can write a Haskell version (at the bottom). You can see it is very similar to the Python! Programming in Haskell really isn’t all that different from immutable programming in Python, contrary to what one might expect. The two programs give the same result:

cabal run && python3 purses.py
...
(4,[[7,7,7,3],[7,7,5,5],[7,7,3,7]])
(4, [[7, 7, 7, 3], [7, 7, 5, 5], [7, 7, 3, 7]])

What substantive changes did we have to make to the Python?

  1. We had to explicitly pass visited out of each recursive call to get_purses' (because it is mutated by visited.add(value)).
  2. We had to rewrite the loop to a foldl', explicitly passing in and out the variables mutated by the loop: min_len, all_paths and visited.

Interesting observation: The fact that you are passing visited in as a parameter to the recursive call makes me think you don’t actually intend to mutate it. If so, this a big win for immutable style! The explicitness made it clear you were doing something you didn’t intend to. If we simulating mutation of visited (i.e. we stop returning it as a result from get_purses') then we get a result that might be better, at least there are more correct results:

(4,[[7,7,7,3],[7,7,5,5],[7,7,3,7],[7,5,7,5],[7,5,5,7],[7,3,7,7],[5,7,7,5],[5,7,5,7],[5,5,7,7],[3,7,7,7]])

Hope that helps, and please reply to let me know whether the mutation of visited is intentional or not.

Faithful Haskell implementation:

module Main where

import qualified Data.Set as Set
import Data.List (foldl')

get_purses :: (Num a, Ord a) => [a] -> a -> (Int, [[a]])
get_purses purse value =
  let (min_len, all_paths, _visited) =
        get_purses' purse value (10 * 1000 * 1000 * 1000) [] Set.empty
  in (min_len, all_paths)

get_purses' :: (Num a, Ord a)
            => [a] -> a -> Int -> [a] -> Set.Set a -> (Int, [[a]], Set.Set a)
get_purses' purse value min_len path visited =
  if value == 0 then
     let min_len' = min min_len (length path)
     in (min_len', [path], visited)

  else if length path >= min_len || value `elem` visited then
     (min_len, [], visited)
  else
     let visited' = Set.insert value visited
         all_paths_init = []
         f (min_len', all_paths, visited'') p =
           let path' = path ++ [p]
               (min_len'', new_path, visited''') =
                 get_purses' purse (value - p) min_len' path' visited''
               all_paths' = all_paths ++ new_path
           in (min_len'', all_paths', visited''')
     in foldl' f (min_len, all_paths_init, visited') (filter (\x -> value - x >= 0) purse)

main :: IO ()
main = print (get_purses [7, 5, 3] 24)

Haskell implementation without returning visited, perhaps more correct?

module Main where

import qualified Data.Set as Set
import Data.List (foldl')

get_purses :: (Num a, Ord a) => [a] -> a -> (Int, [[a]])
get_purses purse value = get_purses' purse value (10 * 1000 * 1000 * 1000) [] Set.empty

get_purses' :: (Num a, Ord a)
            => [a] -> a -> Int -> [a] -> Set.Set a -> (Int, [[a]])
get_purses' purse value min_len path visited =
  if value == 0 then
     let min_len' = min min_len (length path)
     in (min_len', [path])

  else if length path >= min_len || value `elem` visited then
     (min_len, [])
  else
     let all_paths_init = []
         f (min_len', all_paths) p =
           let path' = path ++ [p]
               (min_len'', new_path) =
                 get_purses' purse (value - p) min_len' path' visited
               all_paths' = all_paths ++ new_path
           in (min_len'', all_paths')
     in foldl' f (min_len, all_paths_init) (filter (\x -> value - x >= 0) purse)

main :: IO ()
main = print (get_purses [7, 5, 3] 24)

this was very helpful, thank you everyone. one surprising aspect of learning haskell has been that i have been forced to think critically about what i am trying to accomplish in even my imperative code.

there was a mistake in my python code! i think the visited object, as written, would have worked for BFS, but not for DFS. i rectified this by caching something different. i’ve edited my original python code ^.

as @tomjaguarpaw suggested, i was more careful translating this to haskell, and in doing so learned

  1. i was relying on an implicit, global, mutation of visited, due to python’s pass-by-object-reference.
  2. for a greedy algorithm, i have to control the order in which functions are applied with recursion - otherwise haskell laziness results in an indeterminate ordering.

for reference, here is the final code:

applyFunctions :: [(a -> a)] -> [a] -> [a]
applyFunctions [] results = reverse results
applyFunctions (f:fs) results = applyFunctions fs ((f $ head results):results)

-- for getting things from tuples.
fst3 (a,_,_) = a
snd3 (_,b,_) = b
thd3 (_,_,c) = c

lengthInt :: [a] -> I_t
lengthInt arr = toInteger $ length arr

getPursesExpanded :: [I_t] -> I_t -> I_t -> [I_t] -> Map I_t I_t -> (I_t, [[I_t]], Map I_t I_t)
getPursesExpanded purse value min_len path visited
    | value == 0 = (min min_len (lengthInt path), [path], visited)
    | ((lengthInt path) >= min_len
       || ((member value visited) && (visited ! value <= (lengthInt path)))) = (min_len, [], visited)
    | otherwise = (min_min, all_paths, all_visited)
    where limited_purse = filter (\x -> value - x >= 0) purse
          functions_to = map (\x -> \(new_min, _, new_visited) -> getPursesExpanded purse
                                                                                    (value - x)
                                                                                    new_min
                                                                                    (x:path)
                                                                                    new_visited) limited_purse
          applied_fs = tail $ applyFunctions functions_to [(min_len, [], insert value (lengthInt path) visited)]
          min_min = minimum (min_len:[fst3 x | x <- applied_fs])
          all_paths = foldl (\x -> \y -> (x ++ (snd3 y))) [] applied_fs
          all_visited = head $ reverse (visited:(map thd3 applied_fs))
          inputs = (purse, value, min_len, path, visited, limited_purse, length functions_to)

@tomjaguarpaw i like your rules for writing ‘functional’ python!

fixed python:

import copy
from typing import List, Set, Tuple, Dict
def get_purses(purse: List,
               value: int,
               min_len: int=1e10,
               path: List[int]=None,
               visited: Dict[int, int]=None) -> Tuple[int, List[List[int]], Dict[int, int]]:
    """                                                                                                                                                                                               
    :param purse: the available denominations, eg [1, 5, 10, 25] for american currency.                                                                                                               
    :param value: the target value.                                                                                                                                                                   
    :param min_len: the smallest collection thus found.                                                                                                                                               
    :param path: the current collection going down a branch.                                                                                                                                          
    :param visited: all values that have thus far been calculated.                                                                                                                                    
    """
    purse = sorted(purse)[::-1]

    if path is None:
        path = []
    if visited is None:
        visited = dict()
    new_visited = copy.copy(visited)

    if value == 0:
        min_len = min(min_len, len(path))
        return (min_len, [copy.copy(path)], copy.copy(new_visited))
    elif len(path) >= min_len or new_visited.get(value, 1E10) <= len(path):
        return (min_len, [], copy.copy(new_visited))

    new_visited[value] = len(path)
    all_paths = []
    for p in filter(lambda x: value - x >= 0, purse):
        (min_len, new_path, new_visited) = get_purses(purse, value - p, min_len, copy.copy(path) + [p], copy.copy(new_visited))
        all_paths.extend(new_path)

    return (min_len, all_paths, copy.copy(new_visited))