[Solved] How to reliably get callstacks with GHC 9.10?

Hi everyone,

While debugging a web app using Servant & postgresql-simple, I realised that I wanted the source location of functions that would throw a database exception. Unfortunately it seems that neither IPE backtraces nor HasCallStack are able to provide me with such a thing today.

  • IPE backtraces seem to be unfit for generalised use, according to ghc/ghc#25430

  • Callstacks seem to always stop right before getting into the routes record:

runApp at src/Server.hs:42:4
mkServer at src/Server.hs:33:33

I have a reproducer here, that shows what happens under which circumstances: hecate/backtrace-reproducer - Codeberg.org

To make sure I had not missed a spot, I used @parsonsmatt’s library require-callstack.

Short of manually shoeing callstacks into the exception data types like hpqtypes’s DBException does (which is obviously ridiculous because I cannot control every third-party exception type that I use, especially transitively), what are my options here?

1 Like

Have you tried building with (probably late) cost centre profiling enabled and using the cost centre stack? That will incur a bit of runtime overhead, but might give clearer results (especially since you can add your own SCC pragmas if needed, or control how many are added).

1 Like

Hi, no I haven’t. I will report with my findings. Thank you for the suggestion!

Maybe a bit more verbosely, if this is ever shown by a search engine:

  • you can enable profiling with profiling: True in you cabal.project{,.local} file, check out the cabal docs for how to influence which profiling mode is being picked
  • now you can get costcentre callstacks, if you’re using the new exceptions from ghc 9.10, the costcentre callstack should already be rendered by default when using displayException
  • if you need additional annotations, use the {-# SCC "stuff is happening" #-} pragma to annotate some expression with a costcentre
  • if you want to do something with the callstacks yourself, there are a couple of utils in GHC.Stack.CCS.
1 Like

@adamgundry Here is what late-toplevel gave me:

Cost-centre stack backtrace:
Cost-centre stack backtrace:
  Main.main (app/Main.hs:6:1-4)
  Main.main4 (<no location info>)
  Main.main1 (<no location info>)
  Server.runServer1 (<no location info>)
  Network.Wai.Handler.Warp.Run.run2 (<no location info>)
  Network.Wai.Handler.Warp.Run.run3 (<no location info>)
  Network.Wai.Handler.Warp.Run.run5 (<no location info>)
  Network.Wai.Handler.Warp.Run.$wwithII (<no location info>)
  Network.Wai.Handler.Warp.Date.withDateCache1 (<no location info>)
  Network.Wai.Handler.Warp.FdCache.$wwithFdCache (<no location info>)
  Network.Wai.Handler.Warp.FileInfoCache.$wwithFileInfoCache (<no location info>)
  Network.Wai.Handler.Warp.Run.$wacceptConnection (<no location info>)
  Network.Wai.Handler.Warp.Run.$wfork (Network/Wai/Handler/Warp/Run.hs:328:1-4)
  Server.runServer11 (<no location info>)
  Server.runServer12 (<no location info>)
  Network.Wai.Handler.Warp.Run.fork1 (<no location info>)
  Network.Wai.Handler.Warp.HTTP1.http2 (<no location info>)
  Server.runServer2 (<no location info>)
  Server.mkServer (src/Server.hs:38:1-8)
  Servant.Server.serveWithContextT (src/Servant/Server.hs:186:1-17)
  Servant.Server.Internal.hoistServerWithContext (src/Servant/Server/Internal.hs:(108,3)-(113,22))
  Servant.Server.Internal.$fHasServerTYPENamedRoutescontext (src/Servant/Server/Internal.hs:(1118,3)-(1122,42))
  Servant.Server.Internal.$w$choistServerWithContext (<no location info>)
  Servant.API.Generic.gtoServant (src/Servant/API/Generic.hs:133:5-39)
  Servant.API.Generic.$fGServantProductkM1 (src/Servant/API/Generic.hs:136:10-56)
  Server.mkServer1 (<no location info>)
  Server.$whandleDB (<no location info>)
  Database.PostgreSQL.Simple.Internal.$wconnectPostgreSQL (<no location info>)
  Database.PostgreSQL.Simple.Internal.$wloop (<no location info>)
  Database.PostgreSQL.Simple.Internal.lvl21 (<no location info>)
  Database.PostgreSQL.Simple.Internal.$wthrowLibPQError (<no location info>)
  Database.PostgreSQL.Simple.Internal.lvl20 (<no location info>)
  Server.mkServer4 (<no location info>)

Server.$whandleDB (<no location info>) has no location information, which is not very helpful.

However using the default gives me something much more readable and actionable:

  Main.main (app/Main.hs:(6,1)-(8,11))
  Server.runServer (src/Server.hs:(31,1)-(35,33))
  RequireCallStack.provideCallStack (src/RequireCallStack.hs:(128,1)-(130,40))
  Server.mkServer (src/Server.hs:(38,1)-(43,10))
  Servant.Server.serveWithContextT (src/Servant/Server.hs:(186,1)-(190,96))
  Servant.API.Generic.toServant (src/Servant/API/Generic.hs:102:1-29)
  Server.runApp (src/Server.hs:(61,1)-(64,3))
  Server.handleDB (src/Server.hs:(72,1)-(75,23))
  Database.PostgreSQL.Simple.Internal.connectPostgreSQL (src/Database/PostgreSQL/Simple/Internal.hs:(262,1)-(279,34))
  Database.PostgreSQL.Simple.Internal.connectdb (src/Database/PostgreSQL/Simple/Internal.hs:(285,1)-(308,40))
  Database.PostgreSQL.Simple.Internal.throwLibPQError (src/Database/PostgreSQL/Simple/Internal.hs:(565,1)-(567,27))
  Server.naturalTransform (src/Server.hs:(46,1)-(58,7))
  Effectful.Internal.Monad.unsafeEff_ (src/Effectful/Internal/Monad.hs:157:1-34)

Although the position of connectPostgreSQL in my own code is still missing, but that’s already a great improvement.

I’m reading in the cabal manual that the default is not late:

For GHC this uses exported-functions for libraries and toplevel-functions for executables.

Yeah I know that convention. :slight_smile:

sorry - yeah I guess it could just show the source location of the original function instead of not having any source location for the worker.

1 Like

That would indeed make much more sense

Agreed, it would be better if W/W preserved the info. There’s a ticket for this: #20015: Worker Wrapper should duplicate source notes onto the wrapper · Issues · Glasgow Haskell Compiler / GHC · GitLab

1 Like

This makes sense, and it’s a trade-off: the default inserts cost centres at the beginning of the optimization pipeline, whereas the “late” option means “insert cost centres after optimizations”. Thus the default gives a better backtrace (because cost centres correspond closely to the source code) but potentially worse performance (because cost centres inhibit optimization). So for profiling purposes, late cost centres mean much less distortion to the profile, but if you need the cost centres only for backtraces and aren’t too worried about performance, the default is better.

Looking at the docs I see there are a bunch of new -fprof... options, including -fprof-auto-calls which may given even greater fidelity?

1 Like

Ah, well-spotted. I had stuck with the options supported by cabal. I’ll try with the ones that only GHC can handle.

This is the output I get by having this in my cabal.project file:

package *
  ghc-options: -fprof-auto-calls
  profiling-detail: none
  Network.Wai.Handler.Warp.Run.runSettingsConnectionMakerSecure (Network/Wai/Handler/Warp/Run.hs:226:5-25)
  Network.Wai.Handler.Warp.Run.acceptConnection.acceptLoop (Network/Wai/Handler/Warp/Run.hs:295:9-33)
  Network.Wai.Handler.Warp.Run.fork (Network/Wai/Handler/Warp/Run.hs:330:5-38)
  Network.Wai.Handler.Warp.Run.fork (Network/Wai/Handler/Warp/Run.hs:349:35-46)
  Network.Wai.Handler.Warp.Run.serveConnection (Network/Wai/Handler/Warp/Run.hs:(387,5)-(394,44))
  Network.Wai.Handler.Warp.HTTP1.http1 (Network/Wai/Handler/Warp/HTTP1.hs:43:5-79)
  Network.Wai.Handler.Warp.HTTP1.http1 (Network/Wai/Handler/Warp/HTTP1.hs:44:5-26)
  Network.Wai.Handler.Warp.HTTP1.http1 (Network/Wai/Handler/Warp/HTTP1.hs:45:5-36)
  Network.Wai.Handler.Warp.HTTP1.http1server.loop (Network/Wai/Handler/Warp/HTTP1.hs:(141,9)-(157,42))
  Network.Wai.Handler.Warp.HTTP1.processRequest (Network/Wai/Handler/Warp/HTTP1.hs:194:5-62)
  Network.Wai.Handler.Warp.HTTP1.processRequest (Network/Wai/Handler/Warp/HTTP1.hs:(195,5)-(203,31))
  Servant.Server.serveWithContextT (src/Servant/Server.hs:187:18-76)
  Servant.Server.serveWithContextT (src/Servant/Server.hs:187:39-75)
  Servant.Server.Internal.route (src/Servant/Server/Internal.hs:135:40-88)
  Servant.Server.Internal.route (src/Servant/Server/Internal.hs:875:8-51)
  Servant.Server.Internal.methodRouter.route' (src/Servant/Server/Internal.hs:(297,14)-(305,105))
  Control.Monad.Trans.Resource.runResourceT (Control/Monad/Trans/Resource.hs:193:25-38)
  Control.Monad.Trans.Resource.runResourceT (Control/Monad/Trans/Resource.hs:193:30-37)
  Control.Monad.Trans.Resource.Internal.transResourceT (Control/Monad/Trans/Resource/Internal.hs:169:55-58)
  Servant.Server.Internal.RouteResult.>>= (src/Servant/Server/Internal/RouteResult.hs:57:45-47)
  Server.naturalTransform (src/Server.hs:48:10-38)
  Effectful.Internal.Monad.unsafeEff (src/Effectful/Internal/Monad.hs:150:20-28)

Only one of my own written functions appear (Server.naturalTransform). No idea why it doesn’t go any deeper (or rather, why the routes are not considered “deeper”).

I had a look at your code for naturalTransform, and it seems that it does the backtrace collection itself (in exceptionHandlers). So it makes sense that the backtrace doesn’t go any deeper than naturalTransform to identify the point the exception was thrown, doesn’t it?

I think what you want to see is the backtrace that was collected when the exception was thrown. This should be (implicitly) attached to the SomeException you are catching, as part of the exception context, which you can retrieve using someExceptionContext. I believe the default top-level exception handler will print this for you, but if you install your own handler, it needs to be aware of how to propagate the backtrace.

(Side note: GHC 9.12 makes significant improvements to how exceptions display attached backtraces thanks to work by @romes, see Redesign for exceptions · Issue #285 · haskell/core-libraries-committee · GitHub)

(Another side note: I haven’t experimented with -fprof-auto-calls, but perhaps it needs to be used in addition to other cost centre insertion options (e.g. -fprof-auto) rather than on its own?)

1 Like

Maybe it does! At that point my other option was to use Warp’s own setOnException, which wasn’t that much better.

This should be (implicitly) attached to the SomeException you are catching, as part of the exception context, which you can retrieve using someExceptionContext

Ah-ha, I wasn’t actively aware of exception contexts!

but if you install your own handler, it needs to be aware of how to propagate the backtrace.

In my case the custom handler is so that the exception can be logged to another service (Sentry, Elastic/OpenSearch, you name it).

Thanks a bunch @romes for your recent work. :slight_smile:

1 Like

I’m probably going to stop there, I have reached a satisfying point where I get a helpful amount of traces and source locations.

Here is the setup:

  1. In cabal.project
profiling: True

package *
  profiling-detail: none
  ghc-options: +RTS -A32m -RTS -fprof-auto -fprof-auto-calls
  1. In the code:
  • setBacktraceMechanismState CostCentreBacktrace True
  • RequireCallStack is necessary, without HasCallStack the trace gives much more details (that are useless to me) about Wai internals and the only location I get is the place of definition .

and the natural transform is:

[…]
import Effectful.Exception qualified as E
[…]

naturalTransform :: RequireCallStack => Eff '[IOE] a -> Servant.Handler a
naturalTransform action = do
  result <- liftIO $ Right <$> action
      & (`E.catches` exceptionHandlers)
      & Eff.runEff
  either Servant.throwError pure result
  where
    exceptionHandlers =
      [E.Handler $ \e@(E.SomeException exception) -> do
          let context = E.displayExceptionContext $ E.someExceptionContext e
          liftIO $ putStrLn $ "Exception: " <> E.displayException exception
          liftIO $ putStrLn $ "Context: " <> context
          pure $ Left Servant.err500
      ]

(See the reproducer at hecate/backtrace-reproducer - Codeberg.org)

Which gives me:

Exception: libpq: failed (connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: No such file or directory
	Is the server running locally and accepting connections on that socket?
)
Context: Cost-centre stack backtrace:
  GHC.Internal.TopHandler.runMainIO1 (<no location info>)
  Main.main (app/Main.hs:(6,8)-(8,11))
  Server.runServer (src/Server.hs:(32,13)-(37,33))
  Server.runServer (src/Server.hs:34:3-56)
  Server.runServer.server (src/Server.hs:36:16-40)
  Server.mkServer (src/Server.hs:(41,3)-(45,10))
  Server.naturalTransform (src/Server.hs:(48,27)-(52,39))
  Server.naturalTransform (src/Server.hs:(49,13)-(51,18))
  Server.naturalTransform (src/Server.hs:(49,22)-(51,18))
  Effectful.Internal.Monad.runEff (src/Effectful/Internal/Monad.hs:363:12-73)
  Server.naturalTransform (src/Server.hs:(49,22)-(50,39))
  Server.naturalTransform (src/Server.hs:50:10-38)
  Effectful.Exception.catches (src/Effectful/Exception.hs:(429,27)-(431,56))
  Effectful.Dispatch.Static.Unsafe.reallyUnsafeUnliftIO (src/Effectful/Dispatch/Static/Unsafe.hs:54:26-58)
  Effectful.Internal.Monad.unsafeEff (src/Effectful/Internal/Monad.hs:150:15-29)
  Effectful.Internal.Monad.unsafeEff (src/Effectful/Internal/Monad.hs:150:20-28)
  Effectful.Internal.Monad.>>= (src/Effectful/Internal/Monad.hs:280:36-64)
  Effectful.Internal.Monad.>>= (src/Effectful/Internal/Monad.hs:280:36-39)
  Server.handleDB (src/Server.hs:77:11-39)
  Server.handleDB (src/Server.hs:77:20-39)
  Database.PostgreSQL.Simple.Internal.connectPostgreSQL (src/Database/PostgreSQL/Simple/Internal.hs:(262,29)-(279,34))
  Database.PostgreSQL.Simple.Internal.connectPostgreSQL (src/Database/PostgreSQL/Simple/Internal.hs:263:13-29)
  Database.PostgreSQL.Simple.Internal.connectdb (src/Database/PostgreSQL/Simple/Internal.hs:(285,22)-(287,13))
  Database.PostgreSQL.Simple.Internal.connectdb (src/Database/PostgreSQL/Simple/Internal.hs:287:5-13)
  Database.PostgreSQL.Simple.Internal.connectdb.loop (src/Database/PostgreSQL/Simple/Internal.hs:(290,17)-(308,40))
  Database.PostgreSQL.Simple.Internal.connectdb.loop (src/Database/PostgreSQL/Simple/Internal.hs:(292,7)-(308,40))
  Database.PostgreSQL.Simple.Internal.connectdb.loop (src/Database/PostgreSQL/Simple/Internal.hs:293:30-69)
  Database.PostgreSQL.Simple.Internal.throwLibPQError (src/Database/PostgreSQL/Simple/Internal.hs:(565,37)-(567,27))
  Database.PostgreSQL.Simple.Internal.throwLibPQError (src/Database/PostgreSQL/Simple/Internal.hs:567:3-27)
  GHC.Internal.IO.throwIO (libraries/ghc-internal/src/GHC/Internal/IO.hs:284:1-7)
  GHC.Internal.Stack.withFrozenCallStack (libraries/ghc-internal/src/GHC/Internal/Stack.hs:101:1-19)
  GHC.Internal.Exception.toExceptionWithBacktrace1 (<no location info>)
  GHC.Internal.Exception.Backtrace.collectBacktraces (libraries/ghc-internal/src/GHC/Internal/Exception/Backtrace.hs:136:1-17)
  GHC.Internal.Exception.Backtrace.collectBacktraces1 (<no location info>)

The line Server.handleDB (src/Server.hs:73:20-39))

now refers to the place where the next function connectPostgreSQL starts getting used, and not wher handleDB itself is defined.

Bonus

You can tweak manually how the cost centers are pretty-printed, like this:

  […]
import GHC.Foreign as GHC
import Foreign.Ptr
import System.IO
[…]
    exceptionHandlers =
      [ E.Handler $ \(E.SomeException exception) -> do
          liftIO $ putStrLn $ "Exception: " <> E.displayException exception
          costCenterStackPtr <- liftIO $ getCurrentCCS ()
          costCenterMessage <- liftIO $ ccsToStrings costCenterStackPtr
          liftIO $ Text.putStrLn $ Text.intercalate "\n" costCenterMessage
          pure $ Left Servant.err500
      ]
    ccsToStrings :: Ptr CostCentreStack -> IO [Text]
    ccsToStrings ccs0 = go ccs0 []
      where
        go ccs acc
          | ccs == nullPtr = pure acc
          | otherwise = do
              cc <- ccsCC ccs
              label <- GHC.peekCString utf8 =<< ccLabel cc
              mdl <- GHC.peekCString utf8 =<< ccModule cc
              loc <- GHC.peekCString utf8 =<< ccSrcSpan cc
              parent <- ccsParent ccs
              if mdl == "MAIN" && label == "MAIN"
                then pure acc
                else
                  let entry =
                        object ["module" .= mdl, "label" .= label, "location" .= loc]
                          & encode
                          & LText.decodeUtf8
                          & LText.toStrict
                   in go parent (entry : acc)

Giving you the following:

{"label":"runMainIO1","location":"<no location info>","module":"GHC.Internal.TopHandler"}
{"label":"main","location":"app/Main.hs:(6,8)-(8,11)","module":"Main"}
{"label":"runServer","location":"src/Server.hs:(40,13)-(45,33)","module":"Server"}
{"label":"runServer","location":"src/Server.hs:42:3-56","module":"Server"}
{"label":"runServer.server","location":"src/Server.hs:44:16-40","module":"Server"}
{"label":"mkServer","location":"src/Server.hs:(49,3)-(53,10)","module":"Server"}
{"label":"naturalTransform","location":"src/Server.hs:(56,27)-(60,39)","module":"Server"}
{"label":"naturalTransform","location":"src/Server.hs:(57,13)-(59,18)","module":"Server"}
{"label":"naturalTransform","location":"src/Server.hs:(57,22)-(59,18)","module":"Server"}
{"label":"runEff","location":"src/Effectful/Internal/Monad.hs:363:12-73","module":"Effectful.Internal.Monad"}
{"label":"naturalTransform","location":"src/Server.hs:(57,22)-(58,39)","module":"Server"}
{"label":"naturalTransform","location":"src/Server.hs:58:10-38","module":"Server"}
{"label":"catches","location":"src/Effectful/Exception.hs:(429,27)-(431,56)","module":"Effectful.Exception"}
{"label":"reallyUnsafeUnliftIO","location":"src/Effectful/Dispatch/Static/Unsafe.hs:54:26-58","module":"Effectful.Dispatch.Static.Unsafe"}
{"label":"unsafeEff","location":"src/Effectful/Internal/Monad.hs:150:15-29","module":"Effectful.Internal.Monad"}
{"label":"unsafeEff","location":"src/Effectful/Internal/Monad.hs:150:20-28","module":"Effectful.Internal.Monad"}
{"label":">>=","location":"src/Effectful/Internal/Monad.hs:280:36-64","module":"Effectful.Internal.Monad"}
{"label":">>=","location":"src/Effectful/Internal/Monad.hs:280:36-39","module":"Effectful.Internal.Monad"}
{"label":"naturalTransform.exceptionHandlers","location":"src/Server.hs:65:33-57","module":"Server"}
{"label":"naturalTransform.exceptionHandlers","location":"src/Server.hs:65:42-57","module":"Server"}

which can be sent to a remote logging service as structured data.

(However this is rather truncated compared to what the exception context gives me, but Backtraces does not expose its internals, so it seems impossible to perform the conversion on what someExceptionContext gives us.)

A bit of parsing is required on your side to further normalise location fields in the form "src/Server.hs:65:33-57" into something like

{
  "file": "src/Server.hs",
  "start_line": 65,
  "end_line": 65,
  "start_column": 33,
  "end_column": 57
}

and programmatically match the content of the source file in the logging service. Maybe one day we’ll get such a structured output directly.

This is something that other languages like JS and Ruby can have through Sentry, for instance:

8 Likes