How do I make a function of type `[a] -> IO ()` print each element of type `a` as soon as possible?

Let’s say I have a type called BenchmarkResult and a couple of values

bench1, bench2, bench3 :: IO [BenchmarkResult]

I also have this function that pretty prints [BenchmarkResult]

printBenchmarkResults :: [BenchmarkResult] -> IO ()

My main program is something like this:

main :: IO ()
main = printBenchmarkResults . concat =<< 
        (fmap concat . replicateM nTrials $ sequence [
            bench1, bench2, bench3
            ])

It seems like when I run this program, the results are printed onlu at the end when all the benchmarks are done. Is there a way to print the results as they come in? In other words, I want to be able to see the result of each BenchmarkResult as soon as it is done.

1 Like

I think you are looking for mapM_ function. If you have list of values mapM_ print list_of_vals will print them. Notice that the list is of type Show a => [a] not IO [a]

2 Likes

I think this is a problem with lazy I/O, a feature of Haskell which is known to cause this kind of problem. The quick fix is to use unsafeInterleaveIO from System.IO.Unsafe:

main :: IO ()
main = printBenchmarkResults . concat =<< 
        (fmap concat . replicateM nTrials $ sequence $
            map unsafeInterleaveIO [
            bench1, bench2, bench3
            ])

Some people might advise against this because unsafeInterleaveIO can have unpredictable results (see this Stack Overflow question for details). Although the question in that link is more than 10 years old, I think that the advice to consider using a streaming I/O library like conduit or pipes is still relevant.

Your case is so simple that unsafeInterleaveIO is probably fine, but I don’t know for sure.

1 Like

Another way to fix this is to move the printing closer to the bench functions:

main = replicateM_ nTrials
  $ traverse_ printBenchmarkResults [bench1, bench2, bench3]

Or perhaps even better, keep the replication pure:

main = traverse_ printBenchmarkResults
  $ concat $ replicate nTrials [bench1, bench2, bench3]
4 Likes

The problem is not with

printBenchmarkResults :: [BenchmarkResult] -> IO ()

but with

bench1, bench2, bench3 :: IO [BenchmarkResult]

Since Haskell lists are lazy linked-links, they are always evaluated one element at the time (or at least this is an useful approxmation).

The problem is how this list is produced. When you say “as they come in” you are implying they come in one at the time but the type of the bench functions tells that the list of BenchmarkResult comes in all at once, when the IO action is finished. The lazy part means that some pure computation inside BenchmarkResult could still be delayed until it’s forced by printBenchmarkResults but the sequencing of IO actions is not going to go magically away.

What do to then?

If you just want to run some benchmarks, perhaps the best course of action is to use an already availabe benchmarking library like tasty-bench, which is quite easy to use.

If you want to learn to deal with the problem of interleaving lists and IO, look into a streaming library like streaming, conduit or pipes.

If you want to understand this problem in depths, read about “iteratees” and “ListT done right”.

Please refrain from using unsafePerformIO :slight_smile:

I hope this helps, let me know if it does not.

2 Likes

Thanks for refering me to these libraries.

Could you point me to some small example where there is a producer which produces [a] (or something equivalent to a stream of a) (inside IO) which is consumed by the consumer and processes as soon as it comes in?

1 Like

Using streaming as it’s my favourite library between the ones I have mentioned.

module Main where

import Data.Foldable (for_)
import Data.Function ((&))
import Streaming
import qualified Streaming.Prelude as S

newtype BenchResult = BenchResult Int
  deriving (Show)

runBench :: IO BenchResult
runBench = undefined

runBenchWithParams :: Int -> IO BenchResult
runBenchWithParams = undefined

-- read the type Stream (Of a) m r as
-- a stream of a's computed in m, returning r when it's finished
-- Stream (Of a) m is a Monad do you can use the do notation to
-- concatenate streams.
bench1 :: Stream (Of BenchResult) IO ()
bench1 = do
  result1 <- liftIO runBench
  S.yield result1

  result2 <- liftIO $ runBenchWithParams 1
  S.yield result2

  --- standard combinators work ok
  for_ [2 .. 10] $ \n -> do
    resultn <- liftIO $ runBenchWithParams n
    S.yield resultn

  -- the streaming library as a bunch of combinators too:
  -- this makes a Stream (Of Int) m ()
  S.each [11 .. 20]
    -- this feeds every element to a computation in m (this case IO)
    & S.mapM runBenchWithParams

main :: IO ()
main = do
  -- consuming the stream is very similar to S.mapM but we don't create another stream,
  -- so there is another combinator S.mapM_
  S.mapM_ print bench1
  -- or simply
  S.print bench1

Ask away if you have any questions! Streaming has excellent documentation (if anything it’s a bit overwhelming!)

2 Likes