Mutable Data structures? Why so hard to find

Even if I replace M.toList . countwords by a magic function magic :: [T.Text] -> [(T.Text, Int)] that just evaluates its argument deeply and then returns a precomputed answer then the Haskell solution is still slower than the unoptimized Python solution on my machine. Python (unoptimized): 1.78 s, Haskell: 2.85 s.

If I change magic to return an empty list the whole program still takes 2.77 s, so the problem is definitely the mapping, splitting, and filtering.

If I subtract that overhead from the fastest I got using vector-hashtables: 3.71, then we get 0.86~0.94 which is respectable. Maybe that is achievable without too much work?

Update: for some reason T.toLower and T.words do not improve the speed of the preprocessing much.

3 Likes