Profiling using 'late' cost centres (after optimization)

I thought I’d write up how you can use LateCCPlugin to get more meaningful profiling for GHC 9.2.

Currently (GHC 9.2), cost centres are inserted before optimisation happens. This is bad because you want profiling to influence the performance of your program as little as possible. GHC 9.4 will fix this with an option similar to -fprof-late, but that is not yet available in 9.2. Andreas Klebinger is working on it, and he created a plugin to get similar behaviour on GHC 9.2 and possibly earlier.

You need to enable profiling for all dependencies using e.g. the following snippet in cabal.project:

profiling: true
profiling-detail: none

profiling-detail: none is necessary because automatically inserted SCCs get in the way of the optimizer, so much of the code would not get optimized. We want the plugin to insert SCCs after optimization instead.

Then, add ghc-options: -fplugin=LateCCPlugin to the component that you want to profile, and make sure -rtsopts is on. I prefer -rtsopts over -with-rtsopts because it means you don’t need to rebuild if you want to run without profiling.

You must have the profiled component depend on the plugin: build-depends: late-cc-plugin.
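For reference, here is roughly what the stanza could look like with both pieces in place. This is only a sketch: my-exe and Main.hs are hypothetical names, and your real stanza will have more dependencies.

```cabal
-- my-package.cabal (my-exe and Main.hs are hypothetical names)
executable my-exe
  main-is:          Main.hs
  -- the plugin must be a dependency so GHC can load it
  build-depends:    base
                  , late-cc-plugin
  -- run the plugin, and allow RTS flags on the command line
  ghc-options:      -fplugin=LateCCPlugin -rtsopts
  default-language: Haskell2010
```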

The ghc-options can be set in either the cabal.project file or the cabal file stanza itself.
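A minimal sketch of the cabal.project variant, assuming the package is called my-package (a hypothetical name). Note that a package stanza applies to every component of that package, and that the build-depends entry still has to go in the .cabal file.

```cabal
-- cabal.project (my-package is a hypothetical package name)
package my-package
  ghc-options: -fplugin=LateCCPlugin
```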

The plugin isn’t on Hackage yet, but you can use source-repository-package to make Cabal find it:

source-repository-package
  type: git
  tag: 4b02365f1daeab0fa93dbc7f14e72ba8952376e0
  subdir: late-cc-plugin

(You can do something similar with Stack.)

If you build your component now, check that the plugin gets built as well.

After the build, run your program with +RTS -p -l-au -RTS added to the command line. The -a turns off all event classes and the u turns “user events” back on. These are the only events we need, and leaving out the rest keeps the eventlog from growing very large. See the docs for RTS flags.

Now, a file with the .eventlog extension should have been written. You can convert this file to a format that speedscope can render using hs-speedscope.

Finally, you can view the result in speedscope; it looks something like this:

Thanks to amesgen and Andreas Klebinger for helping me figure this out.

Here is a screenshot of the speedscope:


What a time to be alive! Finally profiling will be honest in the presence of optimizations! There is even visualization!


I followed your steps and got immediate value out of it :slight_smile:

Great stuff, I really love to see content related to profiling!

Note that the field name is profiling-detail, not profiling-details. See 7. cabal.project Reference — Cabal User's Guide