"Core language" for command line arguments

tomjaguarpaw · July 19, 2024, 12:48pm

Following up from GHC and Cabal: the big picture, I hypothesise that part of the problem described is due to poorly-specified APIs[1], particularly the command line API to GHC that enables “packages” (perhaps more precisely, “units” – the ambiguity is symptomatic of the poorly-specified API).

This got me thinking further. Here’s an idea:

A nice way of specifying command line APIs would be to take a subset of all possible command lines for a program, and call it the “core” command line API. It should be as small as possible. Every other valid command line should “desugar” into an instance of the core API. This idea was inspired by the way GHC desugars a very wide range of Haskell syntax into the small Core language. Any ambiguity about what Haskell syntax means can be resolved by inspecting its desugaring.
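
As a rough illustration of the idea (not of GHC's actual command line), here is a minimal Haskell sketch. Everything in it is an assumption made up for the example: the `Core` type, the `--set=key=value` core syntax, and the flags `-v`, `-q` and `--release`. The surface flags are sugar, and each one desugars into a list of explicit core settings.

```haskell
import Data.List (stripPrefix)

-- A hypothetical "core" command line: every setting is an explicit
-- key/value pair, and anything else is a positional input.
data Core
  = Set String String
  | Input FilePath
  deriving (Eq, Show)

-- Desugar one surface argument into core arguments.
-- All flag names below are invented for illustration.
desugar :: String -> Either String [Core]
desugar arg = case arg of
  "-v"        -> Right [Set "verbosity" "2"]
  "-q"        -> Right [Set "verbosity" "0"]
  "--release" -> Right [Set "optimisation" "2", Set "debug-info" "off"]
  _ | Just kv <- stripPrefix "--set=" arg
    , (key, '=' : value) <- break (== '=') kv
                     -> Right [Set key value]  -- already in core syntax
    | '-' : _ <- arg -> Left ("unrecognised option: " ++ arg)
    | otherwise      -> Right [Input arg]

-- Desugar a whole command line, failing on the first bad argument.
desugarAll :: [String] -> Either String [Core]
desugarAll = fmap concat . traverse desugar

-- Render a core argument back into the core command line syntax.
render :: Core -> String
render (Set key value) = "--set=" ++ key ++ "=" ++ value
render (Input path)    = path
```

In this sketch, a core command line desugars to itself, so `desugarAll (map render core)` gives back `Right core` (as long as keys contain no `=` and inputs do not start with `-`). Any question about what `--release` means is then answered by looking at its desugaring.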

I was wondering if anyone had any thoughts about this idea of applying “desugar to core” to other APIs, such as command lines.

[1] I interpret “APIs” very broadly, and generally take it to mean “all programmer interfaces” rather than “application programmer interfaces”

jaror · July 19, 2024, 12:59pm

This is just the principle that you should aim for loose coupling in your software.

tomjaguarpaw · July 19, 2024, 1:01pm

Is it? The principle of loose coupling is well-known, of course, but I’ve never heard anyone make the case that a good way to achieve that is to have a “core” API that everything else can desugar into. Maybe everyone else implicitly understood that? If so it has passed me by.

jaror · July 19, 2024, 1:49pm

My intuition is that simpler interfaces are less coupled. It would, for example, be easier to swap out one of the components if the replacement only has to implement a simple interface. A core API aims to find the simplest possible interface, so I’d say that is the same as trying to decrease coupling.

tomjaguarpaw · July 19, 2024, 1:57pm

That’s interesting. I wonder if loose coupling can be characterized by the ability to interpose a “translation layer” between two components (i.e. something that does exactly this desugaring).

DavidB · July 19, 2024, 2:18pm

I think what you propose is close to Git’s distinction between “plumbing” and “porcelain” commands: Git - Plumbing and Porcelain

jackdk · July 19, 2024, 2:31pm

I think that’s the downstream effect but not the goal that produces it. I think the architectural principle here is the search for a narrow waist that enables the loose coupling.

tomjaguarpaw · July 19, 2024, 2:34pm

Ah nice! Great to see this prior art. Thanks.

Thanks, this definition from your link captures what I was trying to get at (or at least a lot of it).

Consider using fewer concepts, data structures, and types in foundational software like programming languages and operating systems.

This style allows for more composition and ad hoc reuse. It scales in multiple dimensions. It evolves gracefully (and messily) over decades.

When introducing a new concept, define a way to reduce it to an existing concept.

So synthesising the separate points I think I’m trying to achieve loose coupling of components that communicate via command line API by making that API a “narrow waist”. The plumbing/porcelain distinction is a specific way of characterising the approach.

janus · July 19, 2024, 3:02pm

How do you iteratively approach this? I worry that the API is too big/complicated to get right at first. One way to work iteratively would be to say, “these are the primary features I want to support”. I don’t even know if we have statistics on usage of different Cabal features. Even if we don’t, I am sure there are opinions on which features are most important.

For example, let’s say you need a set of special flags to reliably use pkg-config across the Linux distributions in Tier 1 of GHC support. Would that be something you rank higher than e.g. Backpack support? I suspect that for most users, it would be.

If we could get consensus on prioritized features like this, I’d see that as one way of getting started on this. But I am not a Cabal dev. I might be thinking too much in a ‘start from scratch’ frame of mind.

In these discussions, I am so surprised to never see MicroCabal or MicroHs mentioned. I suspect that some people just consider it plain unrealistic to start over from scratch like this. But it has one redeeming quality to me: You’re forced to find out which features you wanna prioritize. Judging from the community buzz around MicroHs/MicroCabal, I feel like I am the only person excited about this.

I suppose there is also another school of thought which would be to say: let’s get the fundamentals right first, make sure Backpack is natively supported in a new specification without workarounds. I wonder if that kind of approach would have any proponents? It would be surprising, because I see so little Backpack usage. For example, I saw a Cabal dev admitting the other day to not knowing the details of Backpack. Which makes total sense to me. Hope I am not hurting anybody by pointing this out.

I am also surprised that you haven’t mentioned hc-pkg and such. Isn’t that a command line interface? Why only specify GHC flags? Specifying the package database is just as essential, and just as hard, right?

hasufell · July 19, 2024, 3:12pm

Yes. This is the (extreme) unix way like git does: you start with low-level primitives and expose them all as programs/commands. Then stitch together higher-level commands (possibly as shell scripts first).

I’m afraid I don’t have enough insights into GHC to suggest something. I have somewhat of an idea about cabal (e.g. one command gathers constraints, drops them to a file, another command runs those through a solver, drops the result to a build plan file, the next command executes that plan, etc. etc.).

That’s why I think that the main benefit is actually approaching the architecture via those primitives and keeping very strict boundaries between the parts. Otherwise you’re just exposing some CLI API that doesn’t reflect the architecture behind it. And then you get brittle boundaries all over.
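
The staged workflow described above (gather constraints, run a solver, execute the resulting plan) can be sketched in Haskell as a toy. Nothing here is cabal's real design: the `Constraints` and `Plan` types, the "pick the highest version" solver and the `build` lines are all assumptions for illustration. Each stage only reads and writes text, which stands in for the files passed between separate commands.

```haskell
import Text.Read (readMaybe)

-- Hypothetical intermediate formats passed between commands.
type Constraints = [(String, [Int])]  -- package name, candidate versions
type Plan        = [(String, Int)]    -- package name, chosen version

-- Stage 1: parse lines such as "foo 1 2 3" into constraints.
gather :: [String] -> Either String Constraints
gather = traverse parseLine
  where
    parseLine l = case words l of
      (name : vs) | Just ns <- traverse readMaybe vs -> Right (name, ns)
      _ -> Left ("cannot parse constraint: " ++ l)

-- Stage 2: a toy "solver" that picks the highest candidate version.
solve :: Constraints -> Either String Plan
solve = traverse pick
  where
    pick (name, []) = Left ("no candidate version for " ++ name)
    pick (name, vs) = Right (name, maximum vs)

-- Stage 3: turn the plan into build steps (here just strings).
execute :: Plan -> Either String [String]
execute plan = Right ["build " ++ n ++ "-" ++ show v | (n, v) <- plan]

-- Wrap a stage so that it consumes and produces serialised text only,
-- mimicking a command that reads one file and writes another.
stage :: (Read a, Show b)
      => (a -> Either String b) -> String -> Either String String
stage f input = case readMaybe input of
  Nothing -> Left "malformed input file"
  Just a  -> show <$> f a

-- The high-level command is just the low-level stages stitched together.
pipeline :: String -> Either String String
pipeline input = stage gather input >>= stage solve >>= stage execute
```

Because every boundary is a serialised value, any single stage could in principle be swapped for another program that reads and writes the same format.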

mpilgrem · July 20, 2024, 12:22am

One of the problems, for a popular tool with a long history, is how to get from a sub-optimal UI state ‘A’ to an optimal UI state ‘B’ without the transition states being worse than the starting state ‘A’.

For example, Stack’s command line looks like this:

The commands in group 7 all provide information to the Stack user. For me, they are a bit of a mess, but would getting to something more coherent be more painful for users than the current situation?