EDIT: MISREAD THE BENCHMARK.
zsh’s time prints its fields on a single line on macOS, so the computation actually takes 0.46 seconds on my machine, not 0.002 seconds. The 0.007 seconds reflects the overhead of the runtime running with -N4, likely from multi-threaded garbage collection.
I’m currently trying to optimize the pidigits program from the Debian-hosted Computer Language Benchmarks Game in Haskell (this is just for fun; I no longer expect to make major improvements).
I’m basing my changes on this program:
https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/pidigits-ghc-6.html
My version:
-- The Computer Language Benchmarks Game
-- https://salsa.debian.org/benchmarksgame-team/benchmarksgame/
-- contributed by Bryan O'Sullivan
-- modified by Eugene Kirpichov: pidgits only generates
-- the result string instead of printing it. For some
-- reason, this gives a speedup.
-- modified by W. Gordon Goodsman to do two divisions
-- as per the new task requirements.
-- further modified by W. Gordon Goodsman to reduce the memory leak
-- caused by excess laziness in computing the output string before output.
-- liam's version with foldr-for refactor.
{- cabal:
default-language: GHC2021
ghc-options: -O2 -fllvm -threaded
build-depends: base ^>= 4.20.0.0
-}

import System.Environment

pidgits :: Integer -> IO ()
pidgits n = foldr go undefined (0 # (1,0,1)) 1 9 where
  -- d: next digit, cont: rest of the loop, i: digit index, k: columns left in the row
  go d cont i k = if i <= n
    then do
      putStr (show d ++ if k <= 0 then "\t:" ++ show i ++ "\n" else "")
      cont (i + 1) (if k <= 0 then 9 else k-1)
    -- after m digits in a row k is 9-m, so k+1 spaces pad the row to 10 columns
    else if k<9 then putStrLn (replicate (k+1) ' ' ++ "\t:" ++ show n)
    else putStr ""
  j # s
    | n>a || q/=r = k # t
    | True = q : k # (n*10,(a-(q*d))*10,d) -- inline eliminateDigit
    where k = j+1
          t@(n,a,d)=k&s
          q=3$t
          r=4$t -- two calls to extractDigit
  c$(n,a,d) = (c*n+a)`quot`d -- extractDigit
  j&(n,a,d) = (n*j,(a+n*2)*y,d*y) where y=(j*2+1) -- nextDigit

main :: IO ()
main = pidgits.read.head =<< getArgs
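
The foldr-for refactor noted in the header comment can be shown in isolation. This is only a minimal sketch, not part of the benchmark program: numberedPrefix is a made-up name, and the fold builds a function that takes the loop counter as an extra argument, which is the same shape as go d cont i k above.

```haskell
-- Pair each element with a running index and stop after `limit` elements.
-- foldr builds a function Int -> [(Int, a)]; the starting index 1 is
-- applied after the fold, so the counter is threaded through `cont`.
numberedPrefix :: Int -> [a] -> [(Int, a)]
numberedPrefix limit xs = foldr step (const []) xs 1
  where
    step x cont i
      | i > limit = []                    -- stop: `cont` is never called
      | otherwise = (i, x) : cont (i + 1) -- continue with the next index

main :: IO ()
main = do
  print (numberedPrefix 3 "abcdef")                 -- [(1,'a'),(2,'b'),(3,'c')]
  -- Works on an infinite list because the rest of the fold is only
  -- forced when `cont` is applied.
  print (map snd (numberedPrefix 4 [10, 20 ..]))    -- [10,20,30,40]
```

Because the loop exits by simply not calling cont, the base case of the fold is never reached on an infinite list, which is why pidgits can pass undefined there.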
On an Apple MacBook Air 15 (M3), I get 0.007 s user time with -N4, but only 0.002 s user time with -N1. Any ideas why this may be?
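
To cross-check numbers like these without depending on how the shell lays out its time output, the program can measure itself. The following is a rough sketch under a few assumptions: timed is a made-up helper name, the summation is a placeholder workload rather than the pidigits computation, and getCPUTime and getMonotonicTime are the functions from base's System.CPUTime and GHC.Clock.

```haskell
import Control.Exception (evaluate)
import GHC.Clock (getMonotonicTime)
import System.CPUTime (getCPUTime)

-- Run an action and return its result together with the CPU seconds
-- and wall-clock seconds that elapsed while it ran.
timed :: IO a -> IO (a, Double, Double)
timed action = do
  wall0  <- getMonotonicTime   -- seconds on a monotonic clock
  cpu0   <- getCPUTime         -- picoseconds of CPU time used by the program
  result <- action
  cpu1   <- getCPUTime
  wall1  <- getMonotonicTime
  let cpuSecs = fromIntegral (cpu1 - cpu0) / 1e12
  pure (result, cpuSecs, wall1 - wall0)

main :: IO ()
main = do
  -- Placeholder workload (an assumption for illustration only);
  -- `evaluate` forces the sum inside the timed region.
  (s, cpu, wall) <- timed (evaluate (sum [1 .. 1000000 :: Integer]))
  putStrLn ("result = " ++ show s)
  putStrLn ("cpu    = " ++ show cpu ++ " s")
  putStrLn ("wall   = " ++ show wall ++ " s")
```

Comparing the two figures under -N1 and -N4 may help separate the cost of the computation from runtime overhead, since CPU time is accumulated across all threads of the process while wall time is not.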