The state of profiling GHC compilation times

Hi. Our project’s Types.hs isn’t particularly big: it’s currently 239 lines, with 11 types (mostly newtypes) and 5 functions. It has a substantial number of “deriving” clauses, but nothing that you’d expect to blow things up.

Unfortunately, compiling that module (as reported by stack build) takes quite a long time, noticeably longer than the others, and memory usage is higher as well.

I was wondering what the current state of tooling is for getting more detailed information on exactly which stages are taking the time. Is there anything more high-level than building GHC itself with profiling?

Ideally, I’d like to understand whether there’s something I can do, either by reorganizing the code or by writing more instances by hand, that would reduce compilation times.

Thank you!

I should add two things. First, I’ve found the --ghc-options="-v2" flag, which does give more info: it confirms that this module’s “simplification” is taking much longer, but that on its own isn’t particularly useful.

Second, I’ve searched the GHC Wiki but failed to find anything related.

Unfortunately, deriving clauses are exactly one of the things that can blow up GHC compile times. Splitting this module out over multiple files, or handwriting some instances may help.

It also depends on the deriving strategies – plain GeneralizedNewtypeDeriving is usually very fast, but deriving instances through DeriveGeneric can be painful.
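To make the distinction concrete, here is a minimal sketch using DerivingStrategies to spell out which mechanism each clause uses. The types UserId and Role are hypothetical and not taken from the module in question; only base is assumed. The file has no main, so load it in GHCi to try it.

```haskell
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE DeriveGeneric #-}

import GHC.Generics (Generic)

-- Hypothetical newtype: "deriving newtype" reuses Int's instances
-- directly, so GHC generates essentially no new instance code.
newtype UserId = UserId Int
  deriving newtype (Eq, Ord, Show)

-- Hypothetical enumeration: "deriving stock" generates fresh code per
-- constructor. The Generic instance in particular grows with the number
-- of constructors, and so does any instance built on top of it.
data Role = Admin | Member | Guest
  deriving stock (Eq, Show, Generic)
```

Writing the strategy explicitly does not change compile times by itself, but it makes it easy to see at a glance which clauses are the cheap newtype kind and which ones go through stock or Generic machinery.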

There was a recent blog post about “Eventful GHC”. It avoids having to recompile GHC with profiling, but it does require recompiling GHC, to enable the new events.
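A lighter-weight option that needs no rebuilt GHC, as far as I know, is the -ddump-timings flag, which reports time and allocations per compiler phase. Below is a minimal sketch that enables it for a single file through an OPTIONS_GHC pragma. The module contents (Shape, area) are a hypothetical stand-in, and the exact output format depends on the GHC version.

```haskell
{-# OPTIONS_GHC -ddump-timings #-}

-- Hypothetical stand-in for the slow module. While this file is being
-- compiled, GHC should print one line per phase (parser, renamer and
-- typechecker, simplifier, code generation, ...) with time and allocations.

data Shape = Circle Double | Square Double
  deriving (Show, Eq, Ord)

-- Area of a shape; only here so the module has something to optimise.
area :: Shape -> Double
area (Circle r) = pi * r * r
area (Square s) = s * s
```

Putting the flag in a pragma rather than in the global GHC options keeps the extra output limited to the one module under suspicion.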

Myself, I have this compilation performance puzzle that I should investigate. When indulging in complex type-level shenanigans, compilation slows down significantly when an expression is given a top-level binding with an explicit type signature. If the expression is in a local binding with an inferred type, things are faster.

For those interested, I think I’ve found, by accident, a major contributor to the compilation time in my particular case.

I’ve found that the bug is happening not only on a commercial project I’m working on, but also in a relatively small open-source one I’m making. Here’s the module that is taking quite a long time to compile: https://gitlab.com/k-bx/meetup/blob/master/backend/meetup/src/Le/Types.hs

I have another codebase that’s based on this one but is considerably larger, and its compilation time isn’t nearly as bad, which gave me a hint about what’s happening. Long story short: the part that’s slowing things down is related to import Data.Time.Zones.DB (TZLabel). Even though there’s no instance derivation happening in the module, I believe that importing the type, which has lots of constructors, is somehow slowing the compilation down.

Just as an experiment, I’ve commented out the import and removed the TZLabel usage everywhere, here are the full project compilation times before and after:

# before:
make fast  36.31s user 2.89s system 148% cpu 26.360 total
# after removing the import of TZLabel from Types.hs:
make fast  19.24s user 1.56s system 152% cpu 13.654 total

Just to add more info, it’s not the import itself, but the

instance J.ToJSON TZLabel where
  toEncoding = J.genericToEncoding (jsonOpts 0)

code that’s bringing the extra compilation time
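One way to test that, sketched here with base only, is to replace the Generic-based encoder with a hand-written one that leans on an already existing Show instance. Label is a hypothetical three-constructor stand-in for a large enumeration such as TZLabel, and encodeLabel is an illustrative name. In a real aeson instance the same idea would mean writing the method by hand instead of calling genericToEncoding, and the output might need adjusting to match whatever jsonOpts produces.

```haskell
-- Hypothetical stand-in for a large enumeration; the real type has
-- hundreds of constructors, which is what makes Generic code expensive.
data Label = Africa__Abidjan | America__New_York | Europe__Kyiv
  deriving (Show, Eq, Enum, Bounded)

-- Hand-written encoder: renders the constructor name as a JSON string
-- literal. No escaping is done, which is only safe because constructor
-- names cannot contain quotes or backslashes.
encodeLabel :: Label -> String
encodeLabel l = "\"" ++ show l ++ "\""

-- Encode every constructor, e.g. to compare against the old output.
allEncoded :: [String]
allEncoded = map encodeLabel [minBound .. maxBound]
```

The point of the design is that the encoder's code size stays constant no matter how many constructors the type has, whereas the Generic representation grows with each one.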

A possibly relevant GHC ticket is #5642: “Deriving Generic of a big type takes a long time and lots of space”.