Optimizing an simple evaluator

reuben · August 13, 2022, 8:31pm

I have a piece of code that I’d like to make faster (it’s a bit of library code that is used heavily, so investing time in optimization seems worthwhile). I suspect the answer involves fairly routine intuitions about performance in Haskell, but I’m a bit of a novice in that area.

A have a program like:

ex = do
  x <- random
  y <- random
  return (x,y)

The idea is that I’d like to “run” ex by providing a list of doubles, one for each call to random in the program. Running the program should run from the top down, and each time a random is encountered, a number should be removed from the head of the list, used as the value for random, and recorded. If the list of doubles is empty, then a value for the call to random should be drawn from an underlying source of randomness, but still recorded in the same way.

Here’s an implementation using MonadWriter and MonadState:

random = do
    trace <- get
    x <- case trace of
        [] -> underlyingSourceOfRandomness
        r : xs -> put xs >> pure r
    tell [x]
    pure x

Now my guess is that the standard WriterT monad transformer is a bad idea here, so I figured I’d use the one from Control.Monad.Trans.Writer.CPS. As for StateT, perhaps the strict variant?

I’d expect the performance of an optimally written version of this code to be linear in the number of calls to random, which I haven’t yet been able to attain.

The version in the codebase I’m working with is here at line 75. It used the Church-transformed free monad transformer to represent a program and then interprets it in using State and Writer as described above.

It’s fast, but I don’t know if it’s optimal (at least, not linear), and more importantly, I don’t really understand exactly what is making the other versions slow, other than having a vague sense that Writer space leaks.

atravers · August 14, 2022, 8:01am

runningList = myListOfDoubles ++ underlyingSourceOfRandomnessAsList

This avoids having to check for an empty list of doubles in each call to random, just as long as you can define underlyingSourceOfRandomnessAsList as a regular Haskell list (i.e. only evaluate what’s needed).

ntwilson · August 14, 2022, 10:05pm

Wouldn’t that tell [x] be non-linear? I’d expect that to be O(n) where n is the number of logs already recorded (xs ++ ys I think has to make a copy of xs to prepend onto ys). Maybe something like a DList with O(1) ++ would help get it linear? I’m guessing it might be slower though for small lists - you’d definitely want to verify that you get a real speedup.

reuben · August 14, 2022, 11:20pm

That’s a good point. I was thinking that the Writer.CPS maybe did roughly the same as a difference list, but maybe what I’ll do is to forget about State for now, and just try Writer with a DList to see what speedup there is, if any.