Type application, as well as Proxy-argument, has performance impact

I was stuck with a weird performance problem: With no apparent reason my runtime exploded from 127 ms to 140s, that is 1000 fold.

I narrowed the problem down and finally found it: Removing

{-# LANGUAGE AllowAmbiguousTypes #-}

and rewriting my type application with a concrete type, runtime was back to 127 ms.

I have one suspicion and a couple of questions … :slight_smile:

This is the code that gets executed roughly 1000 times (I try to reduce to the relevant stuff):

First the fast version, w/o the problem.

optimizeStenoSeries
  :: forall key
  . Palantype key
  => ByteString
  -> Result key
optimizeStenoSeries str =
  -- ...
  Trie.matches primitives str
  -- ...

-- a ByteString trie
primitives :: Trie RawSteno
primitives =
  let str = $(embedFile "primitives.json")
       primMap = case Aeson.decodeStrict str of
           -- correctly parse the file, given the  type Key
           Right m -> m :: PrimMap Key
           Left err -> -- ...
  in  Trie.fromList $ Map.toList unPrimMap primMap

In the slow version with AllowAmbiguousTypes, primitives looks like this:

primitives :: forall key. Palantype key => Trie RawSteno
primitives =
  let str = $(embedFile "primitives.json")
       primMap = case Aeson.decodeStrict str of
           -- correctly parse the file, depending on the key type-variable
           Right m -> m :: PrimMap key
           Left err -> -- ...
  in  Trie.fromList $ Map.toList unPrimMap primMap

and the function gets called by optimizeStenoSeries by means of type application:

  Trie.matches (primitives @key) str

I conclude that in the slow version, the json-file gets parsed again and again, whereas in the quick version primitives always refers to the same trie, efficiently parsed once and kept in memory.


But why?
And: Is this conclusion correct?
Could something similar happen with regular arguments instead of a type application?
How can I avoid that?

EDIT: I quickly replaced the type application with the old-school Proxy :: Proxy key argument and removed AllowAmbiguousTypes. The resulting code suffers from the same problem. I changed the title of the question, to reflect that. My questions remain the same.

1 Like

I opened a GHC issue asking about what the semantics of constraints really are. The compiler will translate constraints into a function that takes a record. E.g.

foo :: Eq a => a -> Bool
foo x = x == x

Will compile to something like:

data Eq a = EqDict { eq :: a -> a -> Bool, neq :: a -> a -> Bool }

foo :: Eq a -> a -> Bool
foo eqDict x = eq eqDict x x

This means that adding a constraint to a value turns it into a function which is called at the usage sites of that value with an automatically inferred dictionary argument based on the types. That is also what you observe with your Palantype key constraint to your primitives function. Instead of being computed once and stored as a value, it is recomputed every time the value is used.

Usually functions and values behave kind of the same in Haskell, but as you see here it can matter for performance and it also starts to matter more if you do advanced things like unlifted or unboxed types or if you involve the magical seq function as I show examples of in the linked GHC issue.

The rule is that you can almost always read => as being translated to ->, except if the left side is empty, e.g. () =>. SPJ proposes some other concrete rules in this comment.

There are also other optimizations that can influence the performance, but those are just the same as optimizations for regular functions and those happen only after all the constraints have been transformed into regular functions.

In particular in your case, it could be compiled to fast code if it would, for example, be able to lift that let binding of primMap out of the primitives function to the top level. Unfortunately, you do seem to use the Palantype key constraint in that binding, so it cannot be lifted out (or the lifted out part would require the same constraint).

What you could perhaps do is to move the primitives definition into a where clause on the optimizeStenoSeries function and lift out the str binding to a sibling definition, like this:

str = $(embedFile "primitives.json")

optimizeStenoSeries
  :: forall key
  . Palantype key
  => ByteString
  -> Result key
optimizeStenoSeries str =
  -- ...
  Trie.matches primitives str
  -- ...
  where
    primMap :: Trie RawSteno -- no constraint required
    primMap = case Aeson.decodeStrict str of
               -- correctly parse the file, given the  type Key
               Right m -> m :: PrimMap key
               Left err -> -- ...

    -- a ByteString trie
    primitives = Trie.fromList $ Map.toList unPrimMap primMap

Then the file will only be decoded once for every call to optimizeStenoSeries, but this of course means that you can no longer use primitives in other parts of your program. And perhaps this is still not fast enough if you call optimizeStenoSeries often.

2 Likes

Thanks!

Moving the primitives to a where-clause would have been perfectly fine given my code. The problem, however, persists even if I do that.

Now I wonder how to deal with performance-heavy functions in general. Googling the topic, I found the vague advise to use IO in case I need control. I.e. with IO I can assert that an expression is being evaluated only once.

EDIT: ah … the problem with moving primitives to the where-clause is this: optimizeStenoSeries is being called a thousand times. I actually moved primitives to top-level just because of that.