My knight's tour is slow

i’ve been learning haskell, and it’s been going pretty well. i went to implement a version of the ‘knights tour’ problem, wherein a path traced by a ‘chess knight’, on a NxN checkerboard, is determined. in an effort to copy the fastest imperative implementations i tried to perform ‘in-place’ operations on a fixed board. the resulting code \works\ and looks very much like my imperative implementations, but is surprisingly slow. any help would be much appreciated, as i think i’m becoming comfortable working within monadic contexts, and now i’d like to learn how to write performant haskell.

for my mutable board i used STArray, and i wrapped it in State so i could do the graph traversal.

thank you in advance.

import Control.Monad.State (execStateT, StateT, get, put, lift)
import Control.Monad.Loops (whileM)
import Control.Monad (unless)
import Control.Monad.ST (runST, ST)

import Data.STRef (newSTRef, STRef, modifySTRef, readSTRef)
import Data.Array.ST (newArray, readArray, writeArray, STArray, getBounds)

-- unless, but with a monadic context
unlessM :: Monad m => m Bool -> m () -> m ()
unlessM mbool statement = mbool >>= (\bool -> unless bool statement)


-- search over these relative points
rd :: [Int]
rd = [2,1,-1,-2,-2,-1,1,2]

cd :: [Int]
cd = [1,2,2,1,-1,-2,-2,-1]

-- `False` if any element of the board is `-1`
isSolved :: STArray s (Int, Int) Int -> ST s Bool
isSolved arr = do
    ((r0,c0),(r1,c1)) <- getBounds arr
    fmap (\x -> (not.(any (\y -> y == -1))) x) $ mapM (\i -> readArray arr i) [(idx, jdx) | idx <- [r0..r1], jdx <- [c0..c1]]


-- given the board and a location, find the locations to where the knight can travel
getNextPoints :: STArray s (Int, Int) Int -> (Int, Int) -> ST s [(Int, Int)]
getNextPoints board (idx, jdx) = do
    ((r0,c0),(r1,c1)) <- getBounds board
    let size = 1 + r1 - r0
        reachable = filter (\(i,j) -> (0<=i) && (i<size) && (0<=j) && (j<size)) [(idx+r, jdx+c) | (r, c) <- zip rd cd]
    values <- mapM (\i -> readArray board i) reachable
    return [p | (p, v) <- zip reachable values, v == -1]


-- find a tour if the knight starts at `(idx,jdx)` traversing the board held in state
-- `step` is the enumerated knight that is being placed
knightFrom :: (Int, Int) -> Int -> StateT (STArray s (Int, Int) Int) (ST s) ()
knightFrom (idx, jdx) step = do

    -- get the board from the state
    boards <- get

    -- get the size of the board
    ((r0,c0),(r1,c1)) <- lift $ getBounds boards
    let size = 1 + r1 - r0

    -- update the board to place knight `idx` at location `(idx, jdx)`
    lift $ writeArray boards (idx, jdx) step

    -- if the board is solved, return
    solved <- lift $ isSolved boards
    if solved then do
        return ()
    else do
        -- get the available points
        next_points <- lift $ getNextPoints boards (idx, jdx)

        -- act on the available points until (?) the board is solved
        mapM (\p -> unlessM (lift $ isSolved boards) (knightFrom p (step+1))) next_points

        -- backtrack if the board is not solved
        solved <- lift $ isSolved boards
        if solved then do
            return ()
        else do
            lift $ writeArray boards (idx, jdx) (-1)
            return ()

readInto2DList :: ST s (STArray s (Int, Int) a) -> ST s [[a]]
readInto2DList st_arr = do
    arr <- st_arr
    ((x0, y0), (x1, y1)) <- getBounds arr
    mapM (\j -> mapM (\i -> readArray arr (i, j)) [x0..x1]) [y0..y1]


findTour :: Int -> [[Int]]
findTour size = runST $ do
    brd <- newArray ((0, 0), (size-1, size-1)) (-1) :: ST s (STArray s (Int, Int) Int)
    readInto2DList $ execStateT (knightFrom (0,0) 0) brd
2 Likes

This appeared in a recent thread:

http://h2.jaguarpaw.co.uk/posts/good-design-and-type-safety-in-yahtzee

i absolutely appreciate the reading, i don’t think it answers my question though?

also, i did realize that mapM is not lazy - replacing that sped up my program.

I have a few ideas which may or may not help:

  • Have you tried Data.Vector (with mutable MVector)? I’ve seen a comment saying that Vector is preferable to Array. and it seems to be used more.

  • The StateT return type of knightFrom looks complex. Would refactoring it to return ST s (STArray s (Int, Int) Int) work? It feels more natural to me to put the boards array explicitly in the return value, rather than stashing it in StateT and returning ().

  • Not Haskell-specific, but isSolved may be slow. Since the knight’s tour is complete after (number of squares on the board - 1) successful steps, can you make it faster by just checking the number of steps instead?

You wouldn’t normally try using laziness extensively in e.g. C because it would make for a peculiar program (to a C programmer) and possibly a slow program, unless you know what you’re doing.

Similarly, you don’t normally try using strictness extensively in Haskell because it would make for a peculiar program, and as you’ve discovered, a slow program at that.

Now you have a slow but working program, you can start to separate out the code without side effects into regular Haskell functions (i.e. no monadic result) which are called by the remaining imperative definitions. As you do this, new opportunities to refactor code can appear - make use of them where you can!

As the linked article shows, if you’re diligent and thoughtful in this process, you should end up with very few imperative leftovers (if at all ;-).

yes, i understand now - thank you.

You should be using neither array nor vector. Instead use a Word64 to store the unoccupied squares as a bitboard.

…or use a bit array for larger boards (and you really need a fast program).

Thank you @tromp and @atravers for the advice! Yes, replacing STArray altogether sounds like a good idea.

On reflection, there are several things about STArray and its alternatives that are unclear to me. I wonder about the following questions:

  • Hypothetically, if you had to keep using STArray, how much would it be possible to improve the code? Should mutation in the ST monad be avoided, and if so, when?

  • About bitarray: I hadn’t heard of it before, and the Hackage page doesn’t seem to have been maintained recently. How widely known and used is it? If not very, do you think it should have a higher profile?

  • More generally, is there documentation about the options for Haskell data structures that are candidates for cases where you might use an array in C (that is, strings of homogeneous data, if possible fast to access and mutable)?

Another good idea is to implement a more efficient algorithm:

This is the primary way you will achieve better performance in Haskell. As for using ST and company:

…but when you go looking for one of those “concisely-expressable” algorithms, they’re almost always described imperatively! This is the legacy of imperative programming,
and it won’t be going away any time soon, at least not without a great deal of effort.

If you can find a declarative description of Warnsdorff’s algorithm, all well and good. Otherwise, my advice is this: instead of approaching this problem as a programmer, think about what a mathematician would do in the same situation - after all, this is Haskell, not C…

i very much appreciate everyone’s help - this has been a multifaceted education for me. i rewrote the code and it now runs imperceptibly fast - meaning that for small problem instances the profiler reports 0s. i also learned that a detail of this particular problem is that some problem instances are very difficult, which is why the warnsdorff heuristic is so helpful - it allows a subset of problem instances to be solved quickly. i implemented the following general improvements, informed by profiling.

  • unboxed arrays
  • cached functions
  • replace mapm when strictness was not desired
  • implemented warnsdorff
  • returned the ‘success flag’ through the recursion
  • turning on compiler optimizations eg -O1/-O2

some problem instances are still slow - but they are slow in rust as well.

i appreciate that there are mutable data structures i could use that would be better suited.

i wrote this in the state monad because, in addition to trying to reproduce an imperative algorithm, i like the state monad, but this may be a crutch. how do i think about the ‘performance benefits’ associated with writing haskell ‘mathematically’, and what does that exactly mean - does it mean avoiding monadic contexts, or avoiding do-, or avoiding do- IN monadic contexts?

i used a mutable data structure because that’s what the algorithm wants - a space to update while it searches the graph. is there a cost to using mutable data structures like eg STUArray?

my update, for posterity:

import Control.Monad.State (execStateT, StateT, get, put, lift)
import Control.Monad.ST (runST, ST)
import Control.Monad.Extra ((&&^), (||^))

import Data.Array.ST (newArray, readArray, writeArray, STUArray, getBounds, STArray)
import Data.MemoTrie (memo2)
import Data.List (sort)

type Board s = STUArray s (Int, Int) Int
data BoardConfig = BoardConfig { startBC :: (Int, Int), sizeBC :: Int, findloopBC :: Bool }

-- search over these relative locations
rd :: [Int]
rd = [2,1,-1,-2,-2,-1,1,2]

cd :: [Int]
cd = [1,2,2,1,-1,-2,-2,-1]

reachable :: Int -> (Int, Int) -> [(Int, Int)]
reachable = memo2 $ (\size -> \(idx,jdx) -> filter (\(i,j) -> (0<=i) && (i<size) && (0<=j) && (j<size)) [(idx+r, jdx+c\
) | (r, c) <- zip rd cd])

-- given the board and a location, find the locations to where the knight can travel
getNextLocs :: BoardConfig -> Board s -> (Int, Int) -> ST s [(Int, Int)]
getNextLocs bc board (idx, jdx) = let locs = reachable (sizeBC bc) (idx, jdx) in do
    values <- mapM (\i -> readArray board i) locs
    return [p | (p, v) <- zip locs values, v == -1]

solveFromPosition :: [(Int, Int)] -> Int -> BoardConfig -> StateT (Board s) (ST s) Bool
solveFromPosition [] _ _ = return False
solveFromPosition (x:xs) step bc = let
                                size = sizeBC bc
                            in do
    solved <- (return $ step == (size^2) - 1)
    if solved then
        return True
    else do
        solved <- knightFrom x (step+1) bc
        if solved then
            return True
	else do
            solved <- solveFromPosition xs step bc
            return solved

-- find a tour if the knight starts at `(idx,jdx)` traversing the board held in state
-- `step` is the enumerated knight that is being placed
knightFrom :: (Int, Int) -> Int -> BoardConfig -> StateT (Board s) (ST s) Bool
knightFrom (idx, jdx) step bc = let
                                    size = sizeBC bc
                                in do

    -- get the board from the state
    board <- get

    -- update the board to place knight `step` at location `(idx, jdx)`
    lift $ writeArray board (idx, jdx) step

    -- if the board is solved, return
    solved <- (return $ step == (size^2) - 1)
    if solved then
        return True
    else do
        -- get the available locations
        next_locs <- lift $ getNextLocs bc board (idx, jdx)

        -- Warnsdorff
        num_adj <- lift $ mapM (\p -> (fmap length) $ getNextLocs bc board p) next_locs
        let locsW = map snd $ sort $ zip num_adj next_locs

        -- act on the available locations until (?) the board is solved
        solved <- solveFromPosition locsW step bc

        -- backtrack if the board is not solved
        if solved then
            return True
        else do
            lift $ writeArray board (idx, jdx) (-1)
            return False
1 Like

My experiments with a more declarative approach got me to this picture of a knights tour on a 4x5 board:

knights-tour

The nodes are the squares over which the knight may move and the edges are the moves. The numbers in the nodes are the coordinates and the numbers on each edge is its “score” which is the sum of the degrees of the nodes it is connected to (minus two because I don’t count the edge itself, but that isn’t necessary).

A knights tour on this 4x5 board can be solved by simply going over each node and choosing the two edges with the lowest score. Perhaps a more robust way is to first choose the edge with the lowest score for each node, filtering out the edges that have become impossible due to this, and then recomputing the scores of the remaining edges.

But I doubt this is sufficient for larger boards.

Edit: Oops, the image above is not a knight’s tour. It has two distinct cycles. But you can easily make it a tour by connecting the two loops in one place, e.g.:

knights-tour2

I think bitvec is the package most people use.

1 Like