Pre-HFTP: GHC should offer low-level logging infrastructure

Thanks, the “performance boost not available to code written as a library” is the part that I would particularly welcome further details about. I guess it has something to do with “one unified process receives all these logging messages from the other processes” where “process” here is something specific to Erlang’s runtime, and we’d have something similar in GHC’s runtime. Is that right?

I don’t believe hiding the dependency as “part of GHC” is benefitting this.

Please elaborate

Please elaborate

yes, process as in green thread, like what we have in GHC’s RTS today. :slight_smile:

snarky XKCD about something different

Please read the bit where I say that we retain our various higher-level interfaces

You’ve now moved from “packages may have a dependency on a logging library if they need it” to “all packages now have a dependency on a logging library because it’s part of GHC”. Not sure how this is beneficial.

Similarly, “the licenses on these libraries aren’t compatible with <project>, so now we’ve just rewritten the hypothetical licenses to all be the same as the GHC license”. Again, not sure what is gained here.

Please read the bit where I say that we retain our various higher-level interfaces

I never thought that the higher-level interfaces would go away, but 2 of the 3 points are in this higher-level interface space and therefore are not solved by this.

Similarly, “the licenses on these libraries aren’t compatible with <project>, so now we’ve just rewritten the hypothetical licenses to all be the same as the GHC license”. Again, not sure what is gained here.

Putting together a product with a mixed bag of licenses is not always a nice experience if you have a legal department that enforces compliance with some (internal) standard. Since GHC is already validated (because the idea is to introduce Haskell, or it has already been introduced), using the same license would reduce the headache.

Okay, but now you:

  • have the same problem if you want to use a higher-level interface (because you still have license hell).
  • are rolling an in-house logger from low-level primitives, which you probably did anyway if you had problems with the higher-level interfaces.

Both of these seem like ecosystem problems, not “we need more things in GHC” problems.

I’ve been thinking about something similar recently, although I’m not sure if it’s exactly what you have in mind.

GHC’s existing eventlog is pretty helpful as a unified interface for getting at both user-supplied and RTS logging information. But I think there’s a big gap in the sorts of user-supplied messages it can emit.

You can basically emit arbitrary strings or arbitrary binary data. The issue is that, as far as the eventlog is concerned, these are completely unstructured. It’s normal for users to encode more structure, but generic eventlog libraries don’t know about it; e.g., ethercrow/opentelemetry-haskell (the OpenTelemetry Haskell client, https://opentelemetry.io) embeds OpenTelemetry data in them.

I think it would be good to add some more events to the eventlog that allow emitting structured events/traces, largely modelled on the OpenTelemetry data model.

This gets us a bit closer to what you are proposing, but there is still a gap. The main way to consume the eventlog is to either write it to a file or pipe it to another process. But we’d ideally want to process this data as part of the main process, on a Haskell thread, to make things user-friendly. I think something like this should be doable, since the eventlog interface has a configurable way to decide how to process this data, which well-typed/ghc-eventlog-socket uses to send the eventlog stream to a Unix domain socket, for instance.

So basically what I’m proposing is that, rather than adding a new logging system to GHC, we re-use the eventlog and extend it in ways that allow us to achieve the types of things you mention. That gives us the advantage of having one unified logging system for both user and low-level logging (and also gives an impetus to clean up or finish off a lot of the really neat experiments in this area).
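
To make the “unstructured strings” point concrete, here is a minimal sketch of what emitting a user event looks like today. Only traceEventIO (from Debug.Trace in base) is a real API; encodeFields, logEvent, and the key=value convention are hypothetical names and choices made up for illustration, and they are not the OpenTelemetry encoding.

```haskell
import Debug.Trace (traceEventIO)

-- | Hypothetical helper: flatten key/value pairs into one "k=v k=v" string.
-- The eventlog only ever sees the resulting opaque string; the key=value
-- convention is ours, so generic eventlog tools cannot interpret it.
encodeFields :: [(String, String)] -> String
encodeFields kvs = unwords [k ++ "=" ++ v | (k, v) <- kvs]

-- | Hypothetical wrapper: emit one user event with a name and some fields.
-- The event is only recorded when the eventlog is enabled (for example,
-- by running the program with +RTS -l).
logEvent :: String -> [(String, String)] -> IO ()
logEvent name kvs = traceEventIO (name ++ " " ++ encodeFields kvs)

main :: IO ()
main = logEvent "http.request" [("method", "GET"), ("status", "200")]
```

Any consumer has to know the ad hoc encoding to recover the fields, which is the gap that dedicated structured event types would close.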

My 2c, but I don’t think this should be part of GHC at all. You mention Erlang, but how many languages provide logging at the runtime level? There might be common logging interfaces like log4j in Java, or libraries in the standard library like log in Golang, but I’ve never heard of a language where the runtime contains such primitives. At the same time, your reasons don’t sound compelling to me (I share the opinion of other users who already replied to the post). Lastly, if it were to be added, I cannot fathom the amount of bike-shedding it would involve; just look at how many comments there are on adding a todo function to the Prelude (haskell/core-libraries-committee issue #260, “Add the `todo` function”).

I do agree that GHC and co. should make it easier to analyze runtime information (last time I tried to use ThreadScope I could not get it to start at all), but this is completely different from application business-logic logging.

Yes, after some thinking, I’m not wedded to it being wired into the RTS. It can live in userland. What do you think about the rest of the proposal?

Well, it would be nice if there were a “blessed” logging library, but I find it hard to see that happening, since it would need to reach MTL/transformers levels of adoption. For example, lens is not ubiquitous despite being one of the nicest solutions to the Records problem.

My suggestion would be to take an existing logging library and contribute as much as possible to make it the “default” solution by improving docs, compatibility with multiple sinks, runtime control, etc.

just popping in to say that GHC already has a pretty-printing-plus-IO library which could be extended into a logging one

https://hackage.haskell.org/package/ghc-9.8.2/docs/GHC-Utils-Outputable.html

Oh but that’s entirely something else :smiley: Outputable is something for the compiler, and incidentally exposed because the GHC API exposes everything. Users of base or ghc-prim or ghc-experimental don’t have access to Outputable because this goes away after compilation.

What I miss from Python/Rust is a way to configure logging verbosity per component. Should this proposal define a common configuration format? For example, using an environment variable like this: env HASKELL_LOG="warp=debug,http-client=warning"
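
As a rough sketch of how such a variable could be interpreted, assuming a comma-separated list of component=level entries: every name below (Level, parseSpec, enabled, the Warning default for unlisted components) is hypothetical, and no existing library is being described.

```haskell
import Data.Char (toLower)
import Data.Maybe (fromMaybe)
import System.Environment (lookupEnv)

-- | Hypothetical severity levels, ordered from most to least verbose.
data Level = Debug | Info | Warning | Error
  deriving (Show, Eq, Ord)

parseLevel :: String -> Maybe Level
parseLevel s = case map toLower s of
  "debug"   -> Just Debug
  "info"    -> Just Info
  "warning" -> Just Warning
  "error"   -> Just Error
  _         -> Nothing

-- | Split a string on a delimiter character (base has no splitOn).
splitOn :: Char -> String -> [String]
splitOn d s = case break (== d) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn d rest

-- | Parse "warp=debug,http-client=warning". Malformed entries are dropped.
parseSpec :: String -> [(String, Level)]
parseSpec spec =
  [ (name, lvl)
  | entry <- splitOn ',' spec
  , (name, '=' : rawLvl) <- [break (== '=') entry]
  , not (null name)
  , Just lvl <- [parseLevel rawLvl]
  ]

-- | Should a message from this component at this level be emitted?
-- Components not listed in the spec default to Warning (an assumption).
enabled :: [(String, Level)] -> String -> Level -> Bool
enabled cfg component lvl = lvl >= fromMaybe Warning (lookup component cfg)

main :: IO ()
main = do
  spec <- fromMaybe "" <$> lookupEnv "HASKELL_LOG"
  print (parseSpec spec)
```

A library would call something like enabled with its own component name before formatting a message, so disabled messages cost almost nothing.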

ah I see, TIL thanks!


@Kleidukos it took me a (rare) moment of clarity to recall that base doesn’t import ghc-the-library.

Yeah, I didn’t want to introduce this too quickly, but it’s also something the Erlang world does. :wink:

I agree it would be extremely useful to have a single foundational logging framework that all libraries can opt in to and configure. Python does this well.

I’m not sure Erlang has it per se; from my understanding it’s more of a BEAM (Erlang VM) primitive.
The GHC RTS and BEAM are quite different: BEAM is designed to be distributed, fault-tolerant, stateful and interactive; it has its own life cycle and exists to support running “processes”.
The GHC RTS, by contrast, is embedded, stateless, has few configuration options, and is mostly abstracted away at runtime.
Adding RTS configuration for IO seems like a fairly invasive change.

I can’t really see the point of making this a part of GHC. Having one true blessed logging infrastructure would really mean that it has to be a very low-level interface, essentially a Builder -> IO () or (Buffer -> IO ()) -> IO (), so that it can be generic enough to support all the higher-level options. And we already have those functions easily accessible. Imposing anything more opinionated on it, like timestamp formats, priorities, or log formats (plain text only? JSON? What if I want CBOR?), just means others have to use the third-party libraries anyway.

I would be interested to know what features are believed to be available to a GHC-blessed logging library that aren’t available to libraries and apps at the moment. I don’t see logging as a particularly heavyweight library as it is, and we already get excellent performance. 99% of the performance wins come from producing efficient types, like Builders, and running the file writing in a separate thread, with messages timestamped in their own thread and sent over a Chan. I’m sure there are more optimisations to be had, but I’d like to hear what they are before this would make any sense to me.
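
For readers who haven’t seen that pattern, here is a minimal sketch of it, assuming the bytestring package (which ships with GHC) for Builder. Logger, Sink, startLogger, and logMsg are hypothetical names, and the monotonic clock from GHC.Clock stands in for a real wall-clock timestamp to keep the example within base and bytestring.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Data.ByteString.Builder (Builder, char7, hPutBuilder, stringUtf8, word64Dec)
import GHC.Clock (getMonotonicTimeNSec)
import System.IO (stdout)

-- | The low-level interface from the post: a sink is just Builder -> IO ().
type Sink = Builder -> IO ()

-- | Messages sent to the writer thread: a rendered line, or a stop signal.
data Msg = Line Builder | Stop

newtype Logger = Logger (Chan Msg)

-- | Start a writer thread that drains the channel into the sink.
-- Returns the logger plus an action that drains pending lines and stops it.
startLogger :: Sink -> IO (Logger, IO ())
startLogger sink = do
  chan <- newChan
  done <- newEmptyMVar
  let loop = do
        msg <- readChan chan
        case msg of
          Line b -> sink b >> loop
          Stop   -> putMVar done ()
  _ <- forkIO loop
  pure (Logger chan, writeChan chan Stop >> takeMVar done)

-- | Timestamp in the calling thread, then hand the Builder to the writer.
logMsg :: Logger -> String -> IO ()
logMsg (Logger chan) txt = do
  t <- getMonotonicTimeNSec
  writeChan chan (Line (word64Dec t <> char7 ' ' <> stringUtf8 txt <> char7 '\n'))

main :: IO ()
main = do
  (logger, stop) <- startLogger (hPutBuilder stdout)
  logMsg logger "server started"
  logMsg logger "listening"
  stop
```

Everything here is ordinary library code, which is the point of the question: it is unclear what a GHC-provided version could do that this cannot.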

The same argument could be made about JSON processing, CLI argument parsing, or any other functionality that can be a library.

It is a great feature of the Haskell ecosystem that such things can be upgraded independent of the compiler version.

Using libraries should be easy. aeson is extremely common for JSON and managed to gain adoption without being put into base. streaming-commons is a popular shared-fundamentals library used by various streaming packages of different authors (e.g. bridging the conduit and pipes ecosystems).

We don’t even have fundamental types such as sets, tree maps, hash maps, etc. in base, but in libraries instead (containers, unordered-containers, which is good), and those are less opinionated than logging libraries.

So it seems better and easier to make a “logging-commons” library and convince people to use it, than to do the same and additionally get it into base, which has a slower development and evolution process.

Further, structured logging requires at least some form of nested lists and objects, which is why most structured-logging libraries depend on at least aeson or some form of containers. So putting structured logging into a base interface would require putting those dependencies into base as well.

Generally, I think it would be useful to identify what the “shared interface that can unify logging libraries” is, before trying to add a vaguely specified thing to GHC.

If GHC is missing concurrency basics, it should add the concurrency basics.
