I think people often want to automatically annotate all their functions. I recently figured out that you can get pretty good cost centre stack traces by building with profiling and using `-fprof-callers="*.*"`. This adds the annotations after optimizations, so it does not have much overhead.
Yes, if you can afford the use of profiling mode then judiciously inserting cost centres after optimization is often the way to go. (I’ve not tried `-fprof-callers="*.*"`; do you know how it compares to `-fprof-late`?) That’s often good enough in development.
But merely turning on profiling mode causes enough of a performance hit to make it infeasible in some production cases (because of the extra word in heap objects), at which point it is crucial to use methods that don’t require the profiling runtime (e.g. IPE info + annotations).
Maybe it would be possible to add these annotations to all functions using a GHC plugin (or a GHC flag that does the equivalent of the profiling one)?
I thought the main difference was that it reports the call site and not the definition site. Consider this program:
import Control.Exception.Backtrace
import Prelude hiding (div)
import qualified Prelude

div :: Int -> Int -> Int
div x 0 = error "Division by zero"
div x y = Prelude.div x y
{-# NOINLINE div #-}

foo :: Int -> Int
foo x = y1 + y2 + y3 where
  y1 = div x 2
  y2 = div 3 x
  y3 = div x 4
{-# NOINLINE foo #-}

main = do
  setBacktraceMechanismState CostCentreBacktrace True
  print (foo 0)
I want the call stack to point out that the problematic `div` is called on line 14, not just that one of the `div` calls in `foo` is problematic.
The `-fprof-auto-calls` option does do that (I’ve truncated GHC.Internal entries):
Cost-centre stack backtrace:
Main.CAF (<entire-module>)
Main.main (T.hs:(18,8)-(20,15))
Main.main (T.hs:20:3-15)
Main.main (T.hs:20:10-14)
Main.foo.y2 (T.hs:14:8-14)
Main.div (T.hs:7:11-34)
I thought `-fprof-callers="*.*"` would do the same, but it seems I am mistaken:
Cost-centre stack backtrace:
Main.CAF (<entire-module>)
Main.Main.main(calling:Main.foo) (T.hs:18:1-4)
Main.Main.foo(calling:Main.div) (T.hs:12:1-3)
For comparison, here’s what `-fprof-late` reports:
Cost-centre stack backtrace:
Main.CAF (<entire-module>)
Main.main (T.hs:18:1-4)
Main.foo (T.hs:12:1-3)
Main.div (T.hs:7:1-3)
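For anyone who cannot use a profiling build, here is a minimal sketch of the same program using `HasCallStack` from `GHC.Stack` instead of cost centres. This is a different technique from the three flags compared above and was not part of that comparison: it needs a hand-written constraint on every function you want tracked, but it runs on the ordinary runtime and records call sites rather than definition sites. The added `HasCallStack` constraints and the `main` signature are my own assumptions, not something from the original program.

```haskell
import GHC.Stack (HasCallStack)
import Prelude hiding (div)
import qualified Prelude

-- HasCallStack (an assumption added for this sketch) makes GHC pass the
-- caller's source location to div implicitly; error then prints it.
div :: HasCallStack => Int -> Int -> Int
div _ 0 = error "Division by zero"
div x y = Prelude.div x y
{-# NOINLINE div #-}

-- Adding the constraint here as well extends the stack to foo's caller.
foo :: HasCallStack => Int -> Int
foo x = y1 + y2 + y3 where
  y1 = div x 2
  y2 = div 3 x   -- the call that divides by zero when x == 0
  y3 = div x 4
{-# NOINLINE foo #-}

main :: IO ()
main = print (foo 0)
```

Running this should abort with `Division by zero` followed by a `CallStack (from HasCallStack)` section that includes the source location of the `div 3 x` call; the exact formatting depends on the GHC version. The trade-off is that nothing is automatic: every function in the chain needs the constraint, which is exactly the annotation burden the flags above try to avoid.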