Pre-HFTP: GHC should offer low-level logging infrastructure

Kleidukos · March 26, 2024, 8:59pm

Yeah I didn’t want to introduce this too quickly but it’s also something the erlang world does.

ocramz · March 26, 2024, 9:07pm

I agree it would be extremely useful to have a single foundational logging framework that all libraries can opt in to and configure. Python does this well.

gdifolco · March 26, 2024, 9:46pm

I’m not sure Erlang has it per se; from my understanding it’s more of a BEAM (Erlang VM) primitive.
The GHC RTS and BEAM are quite different: BEAM is designed to be distributed, fault-tolerant, stateful and interactive, it has its own life-cycle, and it exists to support running “processes”.
The GHC RTS, by contrast, is embedded, stateless, has few configuration options, and is mostly abstracted away at runtime.
It seems a bit impactful to have RTS configuration for IO.

Axman6 · March 26, 2024, 10:38pm

I can’t really see the point of making this a part of GHC, having one true blessed logging infrastructure would really mean that it has to be a very low level interface, essentially a Builder -> IO () or (Buffer -> IO ()) -> IO () so that it can be generic enough to support all the higher level options. And we already have those functions easily accessible. Imposing anything more opinionated on it, like timestamp formats, priorities, log formats (plain text only? JSON? What if I want CBOR?) just means others have to use the third party libraries anyway.

I would be interested to know what features are believed to be available to a GHC blessed logging library that aren’t available to libraries and apps at the moment? I don’t see logging as a particularly heavyweight library as it is, and we get excellent performance as it is. 99% of the performance wins come from producing efficient types, like Builders, and running the file writing in a separate thread, with messages timestamped in their thread and sent over a Chan. I’m sure there are more optimisations to be had, but I’d like to hear what they are before this would make any sense to me.
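For readers who have not seen that pattern, here is a minimal sketch of it using only base and bytestring. The names `Logger`, `startLogger`, `logMsg` and `stopLogger` are made up for illustration, and the monotonic clock stands in for a real wall-clock timestamp; this is not how any particular logging library does it.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Data.ByteString.Builder (Builder, char7, hPutBuilder, string7, word64Dec)
import GHC.Clock (getMonotonicTimeNSec)
import System.IO (Handle, hFlush, stdout)

-- Hypothetical handle to a running logger: a queue of messages plus a
-- signal that the writer thread has finished. 'Nothing' means "stop".
data Logger = Logger (Chan (Maybe Builder)) (MVar ())

-- Start a dedicated writer thread that drains the channel into the handle.
startLogger :: Handle -> IO Logger
startLogger h = do
  chan <- newChan
  done <- newEmptyMVar
  let loop = do
        next <- readChan chan
        case next of
          Nothing -> hFlush h >> putMVar done ()
          Just b  -> hPutBuilder h b >> loop
  _ <- forkIO loop
  pure (Logger chan done)

-- Timestamp in the calling thread, then hand the Builder to the writer.
-- Assumption: a monotonic nanosecond counter is good enough as a timestamp.
logMsg :: Logger -> Builder -> IO ()
logMsg (Logger chan _) body = do
  t <- getMonotonicTimeNSec
  writeChan chan (Just (word64Dec t <> char7 ' ' <> body <> char7 '\n'))

-- Ask the writer to stop and wait until everything queued has been written.
stopLogger :: Logger -> IO ()
stopLogger (Logger chan done) = writeChan chan Nothing >> takeMVar done

main :: IO ()
main = do
  logger <- startLogger stdout
  logMsg logger (string7 "service started")
  logMsg logger (string7 "service stopping")
  stopLogger logger
```

The calling threads only pay for reading the clock and a `writeChan`; all handle I/O happens on the single writer thread, which is also what keeps lines from interleaving.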

nh2 · March 26, 2024, 10:55pm

The same argument could be made about JSON processing, CLI argument parsing, or any other functionality that can be a library.

It is a great feature of the Haskell ecosystem that such things can be upgraded independent of the compiler version.

Using libraries should be easy. aeson is extremely common for JSON and managed to gain adoption without being put into base. streaming-commons is a popular shared-fundamentals library used by various streaming packages of different authors (e.g. bridging the conduit and pipes ecosystems).

We don’t even have fundamental types such as sets, tree maps, hash maps, etc. in base, but in libraries instead (containers, unordered-containers, which is good), and those are less opinionated than logging libraries.

So it seems better and easier to make a “logging-commons” library and convince people to use it, than to do the same and additionally get it into base, which has a slower development and evolution process.

Further, structured logging requires at least some form of nested lists and objects, which is why most structured-logging libraries depend on at least aeson or some form of containers. So putting structured logging into a base interface would require putting those dependencies into base as well.

Generally, I think it would be useful to identify what the “shared interface that can unify logging libraries” is, before trying to add to GHC a vaguely specified thing.

If GHC is missing concurrency basics, it should add the concurrency basics.

atravers · March 27, 2024, 12:32am

…yeah, the 2004 version of that ol’ warhorse was a real blast - it made quite an impact on the programming community.

Moreover, where does it stop? Suppose low-level logging infrastructure was added to GHC - then how can anyone legitimately reject the addition of todo to the Prelude?

Until Haskell 1.3, I/O was a substantial part of Haskell implementations…then the current interface for I/O allowed much of that complexity to be moved into the Haskell language, and no doubt helped to arrive at the relatively-stateless Glasgow RTS that exists now. As a result, the forever-growing complexity of I/O in all its ~~ugliness~~ forms can now usually be dealt with using a combination of Haskell libraries and FFI calls (along with the occasional extra primitive, if absolutely necessary) - for example, in that blogpost Michael Snoyman recommends the stm-chans package/library (because putStrLn not being thread-safe surely would be the very least of GHC’s concurrency troubles: anyone for lightweight concurrency?)

Kleidukos · March 27, 2024, 8:37am

The argument has already been made:

So I’m not sure what you really mean?

So it seems better and easier to make a “logging-commons” library and convince people to use it, than to do the same and additionally get it into base, which has a slower development and evolution process.

Ok let’s split the work then, I make the library and you spend ten years convincing people.

If GHC is missing concurrency basics, it should add the concurrency basics.

“If” ?

tomjaguarpaw · March 27, 2024, 8:54am

As a point of information, those are each implemented in a library (base), not the compiler. EDIT: However, the conversation does seem to have drifted to base rather than GHC, so perhaps I’m missing the point here…

silky · March 27, 2024, 8:54am

I’m open-minded about this in general; but I do wonder if putting something like this in GHC is a bit of a mistake because it will be very slow to change.

I have to say I’m not massively convinced by the licensing argument; I do understand the frustration but that seems like a your-org issue vs an our-community issue, which is the bar I think it would need to hit?

BurningWitness · March 27, 2024, 10:14am

When you say “fast”, is there any real bottleneck you’re hitting or is this the same kinda performance-obsessed frenzy that got us five different strict-input-only parser libraries? I would assume simply doing the bare minimum (which current libraries may or may not do) should be good enough for any use case.

The “filter/format/send/output” division outlined in the Erlang blog post does indeed seem nice, but do you have a good model for how this should look in Haskell? I don’t know if there’s a nice composable core to this or if you are indeed heading straight for the 15th competing standard XKCD joke.
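To make the question concrete, one possible shape for such a core is sketched below. Everything here (`LogAction`, `filterLog`, `formatWith`, `Severity`, `Message`) is a hypothetical name chosen for the example, and it only covers the in-process part; it says nothing about how Erlang actually implements its pipeline.

```haskell
-- Hypothetical core type: a logger is just a consumer of messages.
newtype LogAction msg = LogAction { runLog :: msg -> IO () }

-- "send": combining two actions delivers each message to both backends.
instance Semigroup (LogAction msg) where
  LogAction f <> LogAction g = LogAction (\m -> f m >> g m)

instance Monoid (LogAction msg) where
  mempty = LogAction (\_ -> pure ())

-- "filter": drop messages that do not satisfy the predicate.
filterLog :: (msg -> Bool) -> LogAction msg -> LogAction msg
filterLog keep (LogAction f) =
  LogAction (\m -> if keep m then f m else pure ())

-- "format": adapt a logger of rendered values to a logger of raw values.
formatWith :: (a -> b) -> LogAction b -> LogAction a
formatWith fmt (LogAction f) = LogAction (f . fmt)

data Severity = Debug | Info | Warning | Error
  deriving (Eq, Ord, Show)

data Message = Message { severity :: Severity, body :: String }

render :: Message -> String
render (Message s b) = "[" ++ show s ++ "] " ++ b

-- "output": the final sink, here simply stdout.
toStdout :: LogAction String
toStdout = LogAction putStrLn

-- Filter, then format, then output.
appLogger :: LogAction Message
appLogger = filterLog ((>= Info) . severity) (formatWith render toStdout)

main :: IO ()
main = do
  runLog appLogger (Message Debug "cache miss")  -- dropped by the filter
  runLog appLogger (Message Error "disk full")   -- prints "[Error] disk full"
```

Whether something this small is enough of a shared interface for the existing libraries is exactly the open question; the sketch only shows that the four stages can compose as plain functions.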

angerman · March 27, 2024, 10:50am

Just use/improve event-log or use a library. Let’s not bolt on more or bloat the stuff that’s already hard to manage in GHC. Each additional dependency has a significant tax. Adding a logging facility to GHC will bring with it stability questions around its interface as well. And then GHC has to deal with that on top of what it already has to deal with.

Kleidukos · March 27, 2024, 11:04am

If I understand correctly, you would rather see eventlog be made modular in its backends, since it’s already available for everyone?

TeofilC · March 27, 2024, 11:32am

FWIW the RTS already exposes a C API to process the eventlog output: 5.7. Runtime system (RTS) options — Glasgow Haskell Compiler 9.8.1 User's Guide

Which was what I was referring to here:

chrisdone · March 27, 2024, 9:49pm

I’m sympathetic to seeing something that has standardized buy-in in another language, contrasting with your preferred language in which there are several competing, incompatible libraries.

TravisWhitaker · March 28, 2024, 8:07pm

I think that extending the existing eventlog mechanism already present in the RTS would be a nice way to go about this for a few different reasons:

It’s already a wired-in, very fast way of getting information about program events stored somewhere.
The debug RTS already knows how to emit debugging/profiling information to the event log, and having this in the same data stream as relevant user-defined program events would be beneficial for profiling workflows.
Existing tools already understand the eventlog format.
There’s an existing C-callback-based mechanism for extending eventlog output.

At work I’ve spent some time prototyping a system that stores streams of user events in sqlite. There’s nothing wrong with this being implemented as a library per se, but you have to do some work to recover the desirable properties (particularly the performance properties) of an RTS-integrated solution. For example, I have to have a separate tool to consume an eventlog and insert those events into the event stream.
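As a rough illustration of what writing user events into the eventlog looks like today, here is a small sketch using `traceEventIO` and `traceMarkerIO` from `Debug.Trace` in base. The `logEvent` and `formatEvent` helpers and the `log:level:message` prefix are assumptions made up for this example, not an established convention.

```haskell
import Debug.Trace (traceEventIO, traceMarkerIO)

-- Hypothetical convention: prefix user log lines so that a tool reading
-- the eventlog can tell them apart from other user events.
formatEvent :: String -> String -> String
formatEvent level msg = "log:" ++ level ++ ":" ++ msg

-- Emit one user event into the eventlog (does nothing useful unless the
-- program is run with eventlogging switched on).
logEvent :: String -> String -> IO ()
logEvent level msg = traceEventIO (formatEvent level msg)

main :: IO ()
main = do
  traceMarkerIO "startup"
  logEvent "info" "listening on port 8080"
  logEvent "warn" "config file missing, using defaults"
  traceMarkerIO "shutdown"
```

Linking with `-rtsopts` and running the program with `+RTS -l` should write a `.eventlog` file containing these events alongside the RTS’s own; older GHCs also need `-eventlog` at link time.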

michaelpj · March 28, 2024, 8:58pm

I think this is a great idea. Maybe make a GHC issue?

TeofilC · March 28, 2024, 11:22pm

Thanks! I think something like this is already possible to write as a library. So, I don’t think this requires anything from GHC. If I have some spare time soon I’ll draw up a prototype.

atravers · March 30, 2024, 12:28am

Since it isn’t immediately clear to me from glancing at the associated blog post and research paper:

is the eventlog infrastructure a “permanent fixture” in the RTS?
or is it in the form of an additional library which is linked in to the final program when -eventlog is specified?

adamgundry · April 2, 2024, 7:42pm

It was originally introduced as an optional RTS “way”, with the -eventlog option required to make GHC link a version of the RTS that was built with eventlog support. But recent GHCs enable it unconditionally, because the cost of enabling it is very low and the debugging benefits significant, so it didn’t seem worth the overheads of supporting both eventlog and non-eventlog configurations. (!7983: Enable eventlog support in all ways by default · Merge requests · Glasgow Haskell Compiler / GHC · GitLab)

kephas · April 8, 2024, 10:47pm

I think it’s good to have this discussion as openly as possible, yet we as the general public may not be the best people to decide anything. If I vote for this proposal, that won’t convince the authors of Katip to switch to the proposed logging engine.

I’m very in favor of avoiding premature optimization here, especially since doing it in Haskell is the safest route. It’s easy to replace pure Haskell code in a package with calls to GHC primitives later, isn’t it?

So maybe the next phase should be to poll the authors of the biggest Haskell logging libraries about what they would need as common logging primitives and if they’d be willing to switch to them?

In parallel to that, maybe we could write a POC of that API and provide packages for a few algebraic effect libraries? I’d be happy to write polysemy-common-logging…