How do I find the source of closure leaks shown with `+RTS -hd`?

When I use the heap profiling option -hd, I get a very useful list of which constructors use how much memory, but I also get a bunch of other closures (presumably from thunks) with very unhelpful names, usually of the form Module.Name.sat_randomLetters. For example:

<Data.Map.Internal.sat_sKHW>
<Data.IntMap.Internal.sat_sGFV>

How do I figure out what they correspond to and/or where they come from? I can’t find those strings in the -ddump-simpl output of the module, so I don’t know at which stage of compilation they are generated.

Here’s an example output from hp2ps with a bunch of thunks like this:

I’m not sure about -hd, but with the new info table profiling since GHC 9.2 you do actually get line numbers.

But you’ll probably want to combine that with the new -fprof-late profiling mode (or -fprof-late-inline if that is not detailed enough) introduced in GHC 9.4. Edit: Actually, I’m unsure whether profiling is required. The blog post I linked says it is not, but when I last tried it I think I ran into problems without profiling. Maybe that was actually caused by something else.
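For anyone who wants to try it, here is a rough sketch of what an info table profiling run might look like on a toy program. The module, the `countMod` function and the file name `Leak.hs` are made up for illustration. The flags in the comments are the ones I believe are relevant on GHC 9.2 or later, so treat the exact invocation as an assumption and check it against the user's guide for your GHC version.

```haskell
module Main (main) where

-- Assumed build and run commands (GHC >= 9.2, no profiling build):
--   ghc -O -finfo-table-map -fdistinct-constructor-tables -rtsopts Leak.hs
--   ./Leak +RTS -hi -l -RTS
-- On GHC 9.2 you may also need to pass -eventlog when linking.
-- The resulting eventlog can then be rendered with a tool such as eventlog2html.

import Data.List (foldl')
import qualified Data.Map as Map -- the lazy Map API, chosen on purpose

-- Hypothetical example function: counts how often each residue occurs.
-- Because the lazy insertWith does not force the combined value,
-- each entry accumulates a chain of unevaluated (+) thunks.
countMod :: Int -> [Int] -> Map.Map Int Int
countMod k = foldl' step Map.empty
  where
    step m x = Map.insertWith (+) (x `mod` k) 1 m

main :: IO ()
main = print (Map.foldl' (+) 0 (countMod 10 [1 .. 1000000]))
```

The idea is that the thunks built up in the map values should show up in the -hi profile attributed to a source location, rather than only under a generated name.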


I haven’t updated the project to support ghc-9.2 yet, so I can’t use that option yet, but maybe this is a good motivation to put in the effort to fix the issues with the newer versions of ghc.

Searching the source code of GHC, I can find the string "sat" exactly once, in CoreToStg, so it seems like the sat_... strings pop up when creating STG code, which explains why I couldn’t find them in the dumped Core.

Edit: Yep, adding -ddump-stg does show a bunch of sat_... variables, so that is probably the answer.
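To illustrate, here is a minimal sketch of the kind of code that, as far as I understand it, gives rise to these binders: when a function is applied to an argument that is not a plain variable or literal, GHC let-binds that argument under a fresh sat_... name while preparing Core for STG. The `tally` function and the file name `Sat.hs` are hypothetical, and whether the binders survive depends on the optimisation level.

```haskell
module Main (main) where

-- Assumed way to inspect the result (names and flags may vary by GHC version):
--   ghc -O0 -fforce-recomp -ddump-stg Sat.hs
-- Newer GHCs also offer -ddump-stg-final for the final STG.

import Data.List (foldl')
import qualified Data.Map.Strict as Map

-- Hypothetical example function. Both (x `mod` 10) and (x * 2) are
-- non-atomic arguments, so in the STG dump they are expected to appear
-- as let-bound variables with generated names of the form sat_sXXXX.
tally :: Map.Map Int Int -> Int -> Map.Map Int Int
tally m x = Map.insertWith (+) (x `mod` 10) (x * 2) m

main :: IO ()
main = print (Map.toList (foldl' tally Map.empty [1 .. 100]))
```

Searching the dump for `sat_` next to the call to insertWith should then show which source expression each generated name stands for.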

Edit 2: However, I don’t know how to get the STG for e.g. the base library, since I’m not compiling that from source.

I would highly recommend it. Info table profiling takes a bit of learning, but I literally cannot imagine trying to debug space leaks without it now. It’s just that good.
