I am not sure if we are benchmarking here the same things.
Running the benchmark tool hyperfine
, I get the following:
$ ghc -O2 Words.hs
$ hyperfine "./Words < words.txt"
Benchmark 1: ./Words < words.txt
Time (mean ± σ): 183.6 ms ± 2.2 ms [User: 180.2 ms, System: 2.6 ms]
Range (min … max): 181.3 ms … 189.5 ms 16 runs
$ hyperfine "./words.py < words.txt"
Benchmark 1: ./words.py < words.txt
Time (mean ± σ): 64.2 ms ± 1.7 ms [User: 59.1 ms, System: 5.0 ms]
Range (min … max): 61.5 ms … 69.2 ms 44 runs
Only a three time difference anymore. Maybe text < 2 spends a lot of time encoding and decoding from UTF-16? Changing it to use text-2.0 yields
$ cabal install --lib text --package-env=.
$ ghc -O2 Words.hs
$ hyperfine "./Words < words.txt"
Benchmark 1: ./Words < words.txt
Time (mean ± σ): 137.5 ms ± 2.1 ms [User: 134.7 ms, System: 2.4 ms]
Range (min … max): 133.9 ms … 142.3 ms 21 runs
gives us closer performance metrics already.
However, I agree that it seems like the toLower
function is less efficient compared to python, removing the “toLower” from both yields:
$ hyperfine "./Words < words.txt"
Benchmark 1: ./Words < words.txt
Time (mean ± σ): 50.2 ms ± 1.5 ms [User: 47.8 ms, System: 2.3 ms]
Range (min … max): 48.1 ms … 53.8 ms 59 runs
$ hyperfine "./words.py < words.txt"
Benchmark 1: ./words.py < words.txt
Time (mean ± σ): 54.3 ms ± 3.2 ms [User: 50.6 ms, System: 3.6 ms]
Range (min … max): 50.9 ms … 70.9 ms 56 runs
Maybe just a matter of a missed optimisation opportunity somewhere?
Moreover, avoiding text altogether and using Bytestring without any decoding/encoding like this:
{-# LANGUAGE BangPatterns #-}
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BS8
import System.IO
import Control.Monad
main :: IO ()
main = isEOF >>= \b -> unless b $
foldr seq main . BS8.words . BS.map go =<< BS.getLine
where
go !n = case 65 <= n && n <= 90 of
True -> n + 32
_ -> n
yields better values:
$ hyperfine "./Words < words.txt"
Benchmark 1: ./Words < words.txt
Time (mean ± σ): 49.3 ms ± 1.7 ms [User: 46.4 ms, System: 2.8 ms]
Range (min … max): 46.7 ms … 55.5 ms 59 runs
However, admittedly that’s now probably not fair to python again, since we are avoiding encodings altogether. (Also, note, I did not check whether I introduced a mistake here, please take it with a grain of salt.)