Request: Universal Haskell value printer

Hi all,

I would like to propose, either to the Haskell Foundation or to the community at large, that we build a universal way to print any Haskell value in any context, anywhere.

As a long-term Haskell consultant, I have seen that on almost all “real world” Haskell codebases:

  • There are large nested data types.
  • It is very often the case that there is no Show instance for these types, at all. (Look at, for example, the GHC codebase. There is only a pretty printer, which omits information and generally makes it impossible to see the real values.)
  • It’s almost always impractical to go and derive Show for all the instances of all the nested types.

Approaches that fall short:

  • Show instances often omit information. Or they give a non-source-code view (e.g. Day prints as 2012-02-01).
  • Approaches that look at the heap (ghc-heap-view) show a view of the data completely alien to the normal way of dealing with Haskell values.
  • The GHCi debugger is hard to use and also does not provide fidelity with the original way a data structure was formed in code. In fact, I’ve never seen a single codebase anywhere that had ever been run via the debugger.
  • Class-based approaches don’t work: Show / Generic / Data / Lift. Anything that depends on the availability of a class is inherently not usable, because it requires cooperation from a developer in the past to ensure these classes are implemented (correctly) for all types.
  • The Lift class is sort of related: with Lift, you can go from a normal Haskell value to an AST. In some ways, Lift is a more faithful implementation of the not-really-followed rule-of-thumb that Show should yield valid source code.
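To make the Show point above concrete, here is a minimal sketch. The `Date` and `DateS` types are hypothetical stand-ins invented for illustration (they are not the real `Day` from the time package); the first has a hand-written, human-friendly instance, the second a derived one that prints constructor and fields.

```haskell
-- Hypothetical stand-in for a calendar date type (NOT the real Day).
data Date = Date Integer Int Int

-- Hand-written instance: pleasant to read, but not valid Haskell source.
instance Show Date where
  show (Date y m d) = show y ++ "-" ++ pad m ++ "-" ++ pad d
    where
      pad n = (if n < 10 then "0" else "") ++ show n

-- Same shape, derived instance: prints the constructor and its fields.
data DateS = DateS Integer Int Int
  deriving Show

main :: IO ()
main = do
  print (Date 2012 2 1)   -- prints 2012-02-01
  print (DateS 2012 2 1)  -- prints DateS 2012 2 1
```

Neither output is wrong; they are simply two different views of the same data, and a given type usually ships with only one of them.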

None of them deals with the fact that there are usually a couple of valid presentations: the structural representation of the object, and the abstract data type version. Both are valid, and providing only one deprives the programmer.

Some prior work by me:

  • In The printer Haskell deserves (2014), I explored a simple way to use Data.Data, with mentioned downsides.
  • In the present package I explored a way to derive such a printer for all types using template-haskell, requiring no prior instances, generating a lazy AST, capable of being shown as Show-like or JSON or whatever, and able to penetrate abstract data types like ByteString or Map to give alternative presentations as either e.g. a pointer+len+offset, or “like a string”, or as a tree in the case of Map. You could also provide additional presentations for a known type by implementing an instance of a class for it, as an optional user-extendable enhancement rather than as a base. It only depended on base and template-haskell, so it was easy to include in any project. I got stuck on higher-kinded types as an implementation detail, and overall I’m not sure whether it’s the right long-term approach.
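As a rough illustration of the "one AST, several renderings, alternative presentations" idea described above, here is a small sketch. It is not the present package's actual API: the `Presentation` type, its constructors, the renderer names, and the `BS` constructor with its made-up field values are all assumptions for the example. Because Haskell data is lazy by default, such a tree is only built as far as a renderer walks it.

```haskell
import Data.List (intercalate)

-- Hypothetical presentation AST (names are illustrative only).
data Presentation
  = Atom String                      -- a literal, rendered as-is
  | Con String [Presentation]        -- a constructor applied to fields
  | Choice [(String, Presentation)]  -- named alternative views of one value

-- Show-like rendering; `pref` names the preferred alternative at each Choice,
-- falling back to the first alternative when that name is absent.
renderShow :: String -> Presentation -> String
renderShow pref = go False
  where
    go _ (Atom s) = s
    go _ (Con c []) = c
    go nested (Con c fs) = wrap nested (unwords (c : map (go True) fs))
    go nested (Choice alts) =
      case (lookup pref alts, alts) of
        (Just p, _) -> go nested p
        (Nothing, (_, p) : _) -> go nested p
        (Nothing, []) -> "_"
    wrap b s = if b then "(" ++ s ++ ")" else s

-- JSON-like rendering that keeps every alternative.
-- The escaping only handles quotes and backslashes; it is not a full JSON encoder.
renderJSON :: Presentation -> String
renderJSON (Atom s) = quote s
renderJSON (Con c fs) =
  "{\"con\":" ++ quote c ++ ",\"fields\":["
    ++ intercalate "," (map renderJSON fs) ++ "]}"
renderJSON (Choice alts) =
  "{" ++ intercalate "," [quote k ++ ":" ++ renderJSON p | (k, p) <- alts] ++ "}"

quote :: String -> String
quote s = "\"" ++ concatMap esc s ++ "\""
  where
    esc '"' = "\\\""
    esc '\\' = "\\\\"
    esc c = [c]

-- A byte-string-like value offered both "like a string" and structurally.
example :: Presentation
example =
  Con "Just"
    [ Choice
        [ ("string", Atom "\"hi\"")
        , ("structural", Con "BS" [Atom "0x1f00", Atom "0", Atom "2"])
        ]
    ]

main :: IO ()
main = do
  putStrLn (renderShow "string" example)      -- prints Just "hi"
  putStrLn (renderShow "structural" example)  -- prints Just (BS 0x1f00 0 2)
  putStrLn (renderJSON example)
```

Keeping the alternatives in the tree, rather than choosing one at construction time, leaves the choice of view to whoever is looking at the value.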

GADTs, data families, and other more exotic types present a challenge, but even a best-effort would be valuable there.

What I would like to have, either as a compiler plugin, an extension, Template Haskell, or some other means, is something that lets me say: I have an expression or a type of expression, now give me a function that can print it in a uniform way. I should never be in the middle of working on a codebase, have a value whose type I know, and be stopped dead in my tracks when I want to see what’s inside it.

Because it’s such a fundamental problem, I’m inclined to lean towards even a language extension that would “standardise” it.

If you are in agreement that this is a problem to be solved, and you are:

  • Someone with lots of time on their hands who wants to work on a Haskell project.
  • The Haskell Foundation
  • Someone else with interest in improving Haskell

Then please reply to this topic.

I meet the criteria for responding to this post.

Unfortunately, I don’t have a good solution in mind. I think there are a few different use cases, which might want things in different formats. One mode might be à la ghc-vis, where it doesn’t force evaluation of thunks, but rather shows what’s going on with them. Another mode may force the thunks and get rid of the extra information about them, and just show the internals of the data structure.

Being able to look at data structures without requiring Show instances of the bits for purposes of debugging would help with making tools for debugging in a big way, though, so I’m interested in helping the effort.

I’ve been wanting a project to motivate me to dig deeply into GHC internals, so count me as interested.

recover-rtti might be relevant with its anythingToString function.
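For anyone who wants to try it, a minimal usage sketch follows. It assumes the recover-rtti package is available as a dependency and that `anythingToString` is imported from `Debug.RecoverRTTI`; the `Config` type is made up for the example and deliberately has no Show instance.

```haskell
import Debug.RecoverRTTI (anythingToString)

-- A hypothetical type with no Show instance (and no other instances either).
data Config = Config Int [Bool] (Maybe Char)

main :: IO ()
main =
  -- No class constraint is needed on the argument.
  putStrLn (anythingToString (Config 8080 [True, False] (Just 'x')))
```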

Hot off the presses: GHC is starting to be debugged in GHCi: https://mail.haskell.org/pipermail/ghc-devs/2021-October/020345.html

Not saying this is a solution to your problem, but I thought it was interesting when, last week, I would have agreed with your assessment about no projects being run through the GHCi debugger, but this week I would not.

I hadn’t heard of this package. There are a lot of commonalities (extensibility, producing a tree) that I think are valuable! I’ll give it a try and see where the rough edges are.

The lack of record fields is painful, as field names are there to make values digestible. But perhaps they could be recovered somehow.

That’s an interesting recent development! Thanks for sharing.

One thing that pops up in my mind is how to debug multithreaded apps. Perhaps there’s a story there that I’m not aware of.

Right, ghc-vis is very cool but is also at a lower level of detail than we are typically interested in for debugging purposes, as opposed to e.g. performance purposes.

I think that recover-rtti is the one-level-up library that I didn’t know existed!

Maybe PureScript’s attempt to replace Show with a Debug type class is interesting. Debug is derivable for every data type and produces Doc-like trees instead of strings. They can even be diffed and then printed to the console.
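To give a feel for the "trees instead of strings, and diffable" idea, here is a rough Haskell rendition. It is not PureScript's actual Debug API: the `Tree` type, the `Debug` class, the hand-written instances, and the `diff` function are all assumptions made for this sketch.

```haskell
-- Hypothetical tree of labelled nodes, standing in for a Doc-like structure.
data Tree = Node String [Tree]
  deriving (Eq, Show)

-- Hypothetical class producing a tree rather than a String.
class Debug a where
  debug :: a -> Tree

instance Debug Int where
  debug n = Node (show n) []

instance Debug Bool where
  debug b = Node (show b) []

instance Debug a => Debug [a] where
  debug xs = Node "List" (map debug xs)

instance (Debug a, Debug b) => Debug (a, b) where
  debug (a, b) = Node "Tuple" [debug a, debug b]

-- Report each position (as a path of child indices) where two trees differ,
-- together with the node labels found on each side. A mismatch in the number
-- of children is reported at the parent node.
diff :: Tree -> Tree -> [([Int], String, String)]
diff = go []
  where
    go path (Node a as) (Node b bs)
      | a /= b || length as /= length bs = [(reverse path, a, b)]
      | otherwise =
          concat (zipWith3 (\i x y -> go (i : path) x y) [0 ..] as bs)

main :: IO ()
main =
  print (diff (debug (1 :: Int, [True, False]))
              (debug (1 :: Int, [True, True])))
  -- prints [([1,1],"False","True")]
```

Because the output is structured, a tool can point at exactly the field that changed instead of leaving the reader to compare two long strings by eye.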

This is a great development! Not because I want to debug GHC with ghci but because it looks like ghci debugging is really in for some TLC.

Edsko created recover-rtti for Well-Typed and Juspay to solve pretty much exactly the problem of printing values without access to a Show instance. Perhaps we need to get better at publicizing it…

The lack of record fields is painful, as field names are there to make values digestible. But perhaps they could be recovered somehow.

In principle making record field information available to recover-rtti seems plausible. It already has access to the package/module/constructor names via ghc-heap, so it would just need somewhere to store a mapping from constructors to the field names they use.
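A sketch of what such a mapping could look like, purely as an illustration: the `ConKey` type, the `fieldTable` contents, and `labelFields` are hypothetical, and nothing here reflects how recover-rtti is actually implemented or how the table would be populated.

```haskell
import Data.List (intercalate)

-- Hypothetical key: (package, module, constructor).
type ConKey = (String, String, String)

-- Hypothetical table from constructors to the record field names they use.
fieldTable :: [(ConKey, [String])]
fieldTable =
  [ (("main", "Main", "Person"), ["name", "age"])
  ]

-- Given a constructor key and its already-rendered field values, print record
-- syntax when field names are known and the counts match; otherwise fall back
-- to plain positional output.
labelFields :: ConKey -> [String] -> String
labelFields key@(_, _, con) vals =
  case lookup key fieldTable of
    Just names
      | length names == length vals ->
          con ++ " {"
            ++ intercalate ", " (zipWith (\n v -> n ++ " = " ++ v) names vals)
            ++ "}"
    _ -> unwords (con : vals)

main :: IO ()
main = do
  putStrLn (labelFields ("main", "Main", "Person") ["\"Ada\"", "36"])
  -- prints Person {name = "Ada", age = 36}
  putStrLn (labelFields ("main", "Main", "Point") ["1", "2"])
  -- prints Point 1 2
```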
