Request: Universal Haskell value printer

chrisdone · November 2, 2021, 5:02pm

Hi all,

I would like to propose, either to the Haskell Foundation or to the community at large, that we build a universal way to print any Haskell value in any context, anywhere.

As a long-term Haskell consultant, I have seen that on almost all “real world” Haskell codebases:

There are large nested data types.
It is very often the case that there is no Show instance for these types, at all. (Look at, for example, the GHC codebase. There is only a pretty printer, which omits information and generally makes it impossible to see the real values.)
It’s almost always impractical to go and derive Show for all the instances of all the nested types.

Approaches that fall short:

Show instances often omit information. Or they give a non-source-code view (e.g. Day prints as 2012-02-01).
Approaches that look at the heap (ghc-heap-view) show a view of the data completely alien to the normal way of dealing with Haskell values.
The GHCi debugger is hard to use and also does not provide fidelity with the original way a data structure was formed in code. In fact, I’ve never seen a single codebase anywhere that had ever been run via the debugger.
Class-based approaches don’t work: Show / Generic / Data / Lift – anything that depends on the availability of a class is inherently not usable because it requires cooperation from a developer in the past to ensure these classes are implemented (correctly) for all types.
The Lift class is sort of related: with Lift, you can go from a normal Haskell value to an AST. In some ways, Lift is a more faithful implementation of the not-really-followed rule-of-thumb that Show should yield valid source code.

None of them deals with the fact that there are usually a couple of valid presentations: the structural representation of the object, and the abstract data type version. Both are valid, and providing only one deprives the programmer.

Some prior work by me:

In The printer Haskell deserves (2014), I explored a simple way to use Data.Data, with mentioned downsides.
In the present package I explored a way to derive such a printer for all types using template-haskell, requiring no prior instances, generating a lazy AST, capable of being shown as Show-like or JSON or whatever, and able to penetrate abstract data types like ByteString or Map to give alternative presentations as either e.g. a pointer+len+offset, or “like a string”, or as a tree in the case of Map. You could also provide additional presentations for a known type by implementing an instance of a class for it, as an optional user-extendable enhancement, not as a base. It only depended on base and template-haskell, so it was easy to include in any project. I got stuck on higher-kinded types, which is an implementation detail, and overall I’m not sure whether it’s the right long-term approach.
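
For readers who haven’t seen the Data.Data route, here is a minimal sketch of the idea. It is not the code from the 2014 post or from the present package; the example types and the name `gshowish` are made up for illustration. It also shows the downside mentioned above: every type involved has to derive Data, which is exactly the cooperation you usually can’t count on.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
module Main where

import Data.Data (Data, gmapQ, showConstr, toConstr)

-- Hypothetical example types; each one must derive Data for this to work.
data Colour = Red | Green deriving (Data)
data Point = Point Int Int deriving (Data)
data Shape = Circle Point Int Colour deriving (Data)

-- Structural printer: constructor name followed by its printed children.
-- Nullary constructors and primitives (Int, Char, ...) print bare;
-- anything with children is wrapped in parentheses.
gshowish :: Data a => a -> String
gshowish x =
  case gmapQ gshowish x of
    []   -> showConstr (toConstr x)
    args -> "(" ++ unwords (showConstr (toConstr x) : args) ++ ")"

main :: IO ()
main = putStrLn (gshowish (Circle (Point 1 2) 3 Red))
```

Note that none of these types has a Show instance, yet the value still prints. Record field names, strings and abstract types like Map are not handled nicely by this sketch.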

GADTs, data families, and other more exotic types present a challenge, but even a best-effort would be valuable there.

What I would like to have, either as a compiler plugin, an extension, Template Haskell, or some other means, is something that lets me say: I have an expression or a type of expression, now give me a function that can print it in a uniform way. I should never be in the middle of working on a codebase, have a value whose type I know, and be stopped dead in my tracks when I want to see what’s inside it.

Because it’s such a fundamental problem, I’m inclined to lean towards even a language extension that would “standardise” it.

If you are in agreement that this is a problem to be solved, and you are:

Someone with lots of time on their hands who wants to work on a Haskell project.
The Haskell Foundation
Someone else with interest in improving Haskell

Then please reply to this topic.

myShoggoth · November 2, 2021, 5:38pm

I meet the criteria for responding to this post.

Unfortunately, I don’t have a good solution in mind. I think there are a few different use cases, which might want things in different formats. One mode might be à la ghc-vis, where it doesn’t force evaluation of thunks, but rather shows what’s going on with them. Another mode may force the thunks and get rid of the extra information about them, and just show the internals of the data structure.

Being able to look at data structures without requiring Show instances of the bits for purposes of debugging would help with making tools for debugging in a big way, though, so I’m interested in helping the effort.

jhenahan · November 2, 2021, 6:11pm

I’ve been wanting a project to motivate me to dig deeply into GHC internals, so count me as interested.

jaror · November 2, 2021, 6:13pm

recover-rtti might be relevant with its anythingToString function.

rae · November 2, 2021, 7:33pm

Hot off the presses: GHC is starting to be debugged in GHCi: https://mail.haskell.org/pipermail/ghc-devs/2021-October/020345.html

Not saying this is a solution to your problem, but I thought it was interesting when, last week, I would have agreed with your assessment about no projects being run through the GHCi debugger, but this week I would not.

chrisdone · November 2, 2021, 8:51pm

I hadn’t heard of this package. There are a lot of commonalities (extensibility, producing a tree) that I think are valuable! I’ll give it a try and see where the rough edges are.

The lack of record fields is painful, as field names are there to make values digestible. But perhaps they could be recovered somehow.

chrisdone · November 2, 2021, 8:53pm

That’s an interesting recent development! Thanks for sharing.

One thing that pops up in my mind is how to debug multithreaded apps. Perhaps there’s a story there that I’m not aware of.

chrisdone · November 2, 2021, 9:00pm

Right, ghc-vis is very cool, but it works at a lower level of detail than we are typically interested in for debugging purposes, as opposed to e.g. performance purposes.

I think that recover-rtti is the one-level-up library that I didn’t know existed!

timjs · November 3, 2021, 6:47am

Maybe PureScript’s attempt to replace Show with a Debug type class is interesting. Debug is derivable for every data type and produces Doc-like trees instead of strings. They can even be diffed and then printed to the console.

chris · November 3, 2021, 8:05am

This is a great development! Not because I want to debug GHC with GHCi but because it looks like GHCi debugging is really in for some TLC.

adamgundry · November 3, 2021, 8:42am

Edsko created recover-rtti for Well-Typed and Juspay to solve pretty much exactly the problem of printing values without access to a Show instance. Perhaps we need to get better at publicizing it…

The lack of record fields is painful, as field names are there to make values digestible. But perhaps they could be recovered somehow.

In principle making record field information available to recover-rtti seems plausible. It already has access to the package/module/constructor names via ghc-heap, so it would just need somewhere to store a mapping from constructors to the field names they use.
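
To make the idea concrete, here is a rough sketch of what such a side table could look like. It does not use the recover-rtti or ghc-heap APIs; `ConstrKey`, `FieldTable` and `renderConstr` are hypothetical names, and how the table would actually get populated (Template Haskell, a plugin, something else) is left open.

```haskell
module Main where

import Data.List (intercalate)
import qualified Data.Map.Strict as Map

-- Hypothetical key: a constructor identified by package, module and name.
data ConstrKey = ConstrKey
  { ckPackage :: String
  , ckModule  :: String
  , ckName    :: String
  } deriving (Eq, Ord, Show)

-- Hypothetical side table from constructors to the field names they use.
type FieldTable = Map.Map ConstrKey [String]

-- Render a constructor applied to already-printed arguments. Record syntax
-- is used only when the table has field names matching the argument count;
-- otherwise we fall back to plain positional output.
renderConstr :: FieldTable -> ConstrKey -> [String] -> String
renderConstr table key args =
  case Map.lookup key table of
    Just fields
      | not (null fields), length fields == length args ->
          ckName key ++ " {" ++ intercalate ", " (zipWith field fields args) ++ "}"
    _ -> unwords (ckName key : args)
  where
    field name value = name ++ " = " ++ value

-- Example data, assumed for illustration only.
personKey :: ConstrKey
personKey = ConstrKey "main" "Main" "Person"

exampleTable :: FieldTable
exampleTable = Map.fromList [(personKey, ["name", "age"])]

main :: IO ()
main = do
  putStrLn (renderConstr exampleTable personKey ["\"Ada\"", "36"])
  putStrLn (renderConstr Map.empty personKey ["\"Ada\"", "36"])
```

With the table, the value prints with its field names; without it, the output degrades to the positional form that is available today.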
