When I use the heap profiling option -hd, I get a very useful list of which constructors use how much memory, but I also get a bunch of other closures (presumably from thunks) with very unhelpful names, usually of the form Module.Name.sat_randomLetters. For example:
How do I figure out what they correspond to and/or where they come from? I can’t find those strings in the -ddump-simpl output of the module, so I don’t know at which stage of compilation they are generated.
Here’s an example output from hp2ps with a bunch of thunks like this:
I’m not sure about -hd, but with the new info table profiling since GHC 9.2 you do actually get line numbers.
But you’ll probably want to combine that with the new -fprof-late profiling mode (or -fprof-late-inline if that is not detailed enough) introduced in GHC 9.4. Edit: Actually, I’m unsure whether profiling is required. The blog post I linked says it is not, but when I last tried it I think I ran into problems without profiling. Maybe that was actually caused by something else.
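In case it helps, here is a minimal sketch of what trying info table profiling could look like. The program and the name mean are made up for illustration, and the flags in the comments are the ones I understand GHC 9.2+ to provide (-finfo-table-map, -fdistinct-constructor-tables, and the RTS option -hi), so please check them against the user's guide for your GHC version. The lazy pair accumulator is there only so that the heap has some thunks to attribute to source lines.

```haskell
module Main (main) where

-- Assumed build command (GHC 9.2 or later, no -O so the thunks survive):
--   ghc -finfo-table-map -fdistinct-constructor-tables -rtsopts -eventlog Main.hs
-- (-eventlog should only be needed on older GHCs.)
-- Assumed run command:
--   ./Main +RTS -hi -l -RTS
-- The resulting eventlog can then be rendered with eventlog2html.

-- Hypothetical example: the lazy tuple accumulator builds up chains of
-- unevaluated (+) thunks, which is the kind of closure a heap profile shows.
mean :: [Double] -> Double
mean xs = s / fromIntegral n
  where
    (s, n) = foldl step (0, 0 :: Int) xs
    step (a, c) x = (a + x, c + 1)

main :: IO ()
main = print (mean [1 .. 1000000])
```

With the info table map in place, the profile should label those thunks with a source location instead of only a generated name.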
I haven’t updated the project to support GHC 9.2 yet, so I can’t use that option for now, but maybe this is a good motivation to put in the effort to fix the issues with the newer versions of GHC.
Searching the source code of GHC, I can find the string "sat" exactly once, in CoreToStg, so it seems like the sat_... strings pop up when creating STG code, which explains why I couldn’t find them in the dumped Core.
Edit: Yep, adding -ddump-stg does show a bunch of sat_... variables, so that is probably the answer.
Edit 2: However, I don’t know how to get the STG for e.g. the base library, since I’m not compiling that from source.
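For anyone who wants to see this for themselves, here is a small sketch that should produce a few sat_... binders in the -ddump-stg output. The names total and doubledSum are made up for illustration, and the exact binder names and dump layout will differ between GHC versions and optimisation levels.

```haskell
module Main (main) where

-- Compile with: ghc -ddump-stg Main.hs
-- and search the dump for "sat_".

-- NOINLINE keeps the call to `total` visible in the dump.
total :: [Int] -> Int
total = sum
{-# NOINLINE total #-}

-- The argument `map (* 2) [1 .. n]` is not a plain variable, so GHC has to
-- let-bind it before the call. Those compiler-introduced bindings are the
-- ones that get sat_... names.
doubledSum :: Int -> Int
doubledSum n = total (map (* 2) [1 .. n])

main :: IO ()
main = print (doubledSum 10)
```

Because these binders only exist once arguments have been let-bound on the way to STG, it makes sense that they do not show up in -ddump-simpl.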
I would highly recommend it. Info table profiling takes a bit of learning, but I literally cannot imagine trying to debug space leaks without it now. It’s just that good.