Yes thank you. That is true, and simple and better than what I was saying.
There are many things we might like to do, and could do today, but they would take a very long time and cause much controversy along the way, so we won’t do them.
Leaving aside GHCJS (which IMO is an entirely orthogonal issue), the Haskell ecosystem is largely stuck on GHC 8.10, because there is no Stackage LTS for GHC 9.0. And there is no LTS because GHC 9.0.1 was hardly usable. With the recent release of 9.0.2 and Stackage preparing for LTS 19, this difficulty is likely to be resolved pretty soon.
I strongly suspect that funnelling community efforts into faster migration of packages to newer GHCs has a better ROI than embarking on decoupling base from GHC (which just delays the key issue). And if we can make GHC evolve at the pace of a mature compiler and not an academic PoC, it would bring an even higher ROI overall.
I entirely agree that decoupling base from GHC is a laudable enterprise; my doubts are just about the potential ROI in comparison to other avenues we could pursue (taking into account the limited resources of the community / HF / GHC developers).
Leaving aside GHCJS (which IMO is an entirely orthogonal issue),
Yes, I agree GHCJS is its own separate issue.
The Haskell ecosystem is largely stuck on GHC 8.10, because there is no Stackage LTS for GHC 9.0. And there is no LTS because GHC 9.0.1 was hardly usable. With the recent release of 9.0.2 and Stackage preparing for LTS 19, this difficulty is likely to be resolved pretty soon.
Sure, but I want base work to never be blocked no matter what GHC is up to, and GHC work to never be blocked no matter what base is up to.
I strongly suspect that funnelling community efforts into faster migration of packages to newer GHCs has a better ROI than embarking on decoupling base from GHC (which just delays the key issue).
Well, because of things like GHCJS, that wouldn’t really help me. That GHCJS is orthogonal is sort of my point here — it’s hard to foresee all the reasons one might be stuck on an old GHC or anything else.
I think we are just more robust if we make it so base is never blocked on GHC. Fixing the reasons why GHC would block it in the first place is great, don’t get me wrong! But it leaves in place the underlying fragility where many things can block many other things.
Your “key issue” is, to me, merely the key issue this time around.
I entirely agree that decoupling base from GHC is a laudable enterprise
my doubts are just about the potential ROI in comparison to other avenues we could pursue (taking into account the limited resources of the community / HF / GHC developers).
The initial split in two pieces I don’t think will be so costly, and the ROI is not the split itself, but that it allows a bunch of work which today bogs each other down to proceed in parallel with minimal synchronization. Without this, base is going to limp along like every other language’s stagnated standard library, because it is just too annoying to work on.
Hopefully @hecate and I will find the time to do it, and then some of the discussion here will be a moot point, but we both have many other things in progress, so I am not sure when that would be. So I do think it is good to keep on discussing this stuff, if only to hone the reasoning both ways.
And if we can make GHC evolve at the pace of a mature compiler and not an academic PoC, it would bring an even higher ROI overall.
Granted, I am veering into more subjective stuff here, but ultimately I do want them both to be able to evolve at a fast rate. base is quite bad, and the current hodge-podge of extensions we commonly use is also bad. So any plan that relies on GHC just slowing down (because we simply don’t have the resources yet to move fast and understand how bad each bit of breakage is) makes me nervous.
I guess we can only agree to disagree about our values. I believe that base is quite good; I believe that even a stagnating and limping base is not a problem in an ecosystem allowing for alternative preludes; I believe that GHC must evolve slower.
Fair enough, that is a good articulation of where we disagree.
I’m not following everything here, but there may be one thing that it’s easy to agree on:
As the OP says, maybe we could split ‘base’ into bits that are somehow closely coupled to GHC, and bits that are “pure library code”. The latter can evolve separately without difficulty.
This is a pretty fuzzy distinction and we’d have to find a way to sharpen it up. But for at least some chunks of base, it might not be so hard.
Would that be a concrete step forward? If we agreed the strategy, a few motivated volunteers might well be able to execute on it.
This would also allow easier experimentation around backends, having a known-size and self-contained set of impure/implementation-specific code.
Another option could be a graphical depiction of the module dependencies in base, using colour to highlight SCCs (else there’s the good ol’ topological sort with SCC-detection) - the non-SCC modules can then be considered first as candidates for “splitting off”.
Another bonus would be seeing just how tangled base really is, in order to determine an expedient process of splitting and how many motivated volunteers will be needed (ideally without having to find new ways to motivate them…).
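For what it’s worth, that SCC detection is only a few lines with the containers package. Here is a minimal sketch; the module names and dependency edges below are invented for illustration, and real input would come from something like ghc -M output or the graphmod tool:

```haskell
import Data.Graph (SCC (..), stronglyConnComp)

-- Toy SCC detection over a module-dependency list, using Data.Graph
-- from the containers package.
type Module = String

moduleSCCs :: [(Module, [Module])] -> [SCC Module]
moduleSCCs deps = stronglyConnComp [ (m, m, ds) | (m, ds) <- deps ]

describe :: SCC Module -> String
describe (AcyclicSCC m) = "acyclic: " ++ m          -- candidate to split off first
describe (CyclicSCC ms) = "cyclic:  " ++ unwords ms -- tangled; needs untangling

main :: IO ()
main = mapM_ (putStrLn . describe) (moduleSCCs example)
  where
    -- hypothetical fragment of base's module graph
    example =
      [ ("GHC.Types",  [])
      , ("GHC.Base",   ["GHC.Types", "GHC.Err"]) -- mutually dependent with GHC.Err
      , ("GHC.Err",    ["GHC.Base"])
      , ("Data.Maybe", ["GHC.Base"])
      ]
```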
I think yes, that is the initial goal of “decoupling base and GHC”: start small and keep it simple, focusing on separating out GHC’s dependencies so we have a cleaner interface to work from.
What’s fuzzy about this and what could be done to sharpen the definition here?
I think it’s also worth pointing out a few related resources/discussions; here are a few I’ve found, and there are probably more:
The fuzziness is this: what is a sharp criterion that tells you whether a type or function belongs in the “GHC-specific” part or the “pure library” part?
One possibility is this. Consider an entity E, where “entity” means a type, data constructor, class, or function.
I’m not sure if that’s enough, but it’s a start.
Yes, that is exactly what I was thinking for the first step: “Cleave base in two”.
List and Maybe are perfectly innocent definitions, but as they are wired-in they will end up on the GHC-specific side. That’s not ideal, but it’s perfectly fine for the first stab at this.
In response to @AntC2 on another thread:
Me:
If it was just a matter of “cleaving/splitting/tearing/slicing/cutting/dividing/chopping/ripping up base into two or more pieces”:
- It should have been attempted at least once by now;
- It would have been easier to do back when base was smaller, e.g. back in 2010.

It seems to me that the simplest option is to just move the morass of code in base to a more GHC-centric package, then start afresh, either in an empty base or an all-new package under a different name. This approach provides the luxury of ignorance: you can just start writing new code almost immediately instead of trying to pick a path of least resistance.
Ericson2314:
@atravers yes, base reexports ghc-base, and we move over to base proper just what is convenient. The only difference is I think @Kleidukos’s heuristic of “just non-GHC.* modules” is a better first shot at doing the rip, and I emphasize moving code, not copying code. All that, though, just reflects on the first few hours of the attempt :). After that, I think it’s exactly the same.
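To make those first few hours concrete, here is a minimal sketch of one such re-export shim, assuming a hypothetical ghc-base package that now holds the moved code (PackageImports disambiguates while both packages expose a module of the same name):

```haskell
{-# LANGUAGE PackageImports #-}

-- A thin shim in the new base: the implementation has moved to the
-- (hypothetical) ghc-base package, and base re-exports it unchanged
-- under the same module name.
module Data.Maybe (module X) where

import "ghc-base" Data.Maybe as X
```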
base reexports ghc-base, and we move over to base proper just what is convenient.
Alternately, and if the GHC-centric modules are the minority, ghc-base is initially an empty package which base imports. GHC-centric modules are then replicated or moved, leaving base with the implementation-independent modules.
[…] “only modularity and the flexibility it creates” will save us.
[…] we should have amazing libraries so GHC can be easily “remixed” for all manner of research prototypes.
…and right now (2022 Feb) that would require going as close to fully-parameterised (no type classes) as possible:
Data.List.nubBy :: (a -> a -> Bool) -> [a] -> [a]
Data.List.nub :: Eq a => [a] -> [a]
…in the absence of a feasible solution to the problem of orphan instances. This is one reason why I keep suggesting starting afresh: writing fully-parameterised definitions directly seems a better option than trying to refactor overloaded definitions, which means dealing with all the class dependencies “there and then”.
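As a small illustration of the fully-parameterised style, here is a sketch (not the actual base code) of how the overloaded definition can be recovered from the parameterised one, rather than the other way round:

```haskell
-- Fully-parameterised: the caller supplies the equality test, so no Eq
-- constraint (and hence no instance-selection question) is involved.
nubBy :: (a -> a -> Bool) -> [a] -> [a]
nubBy eq = go []
  where
    go _    []     = []
    go seen (x:xs)
      | any (eq x) seen = go seen xs          -- already seen: drop it
      | otherwise       = x : go (x : seen) xs

-- The overloaded variant is then a thin wrapper.
nub :: Eq a => [a] -> [a]
nub = nubBy (==)
```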
But if anyone else has had experience to the contrary, by all means let us know about it…
As for those who like their definitions overloaded… as previously noted by @Ericson2314, a viable solution to the orphan-instance problem would be very helpful. Having just given Scott Kilpatrick’s approach some more thought (along with a quick skim of section 4 of his thesis), I’m now wondering if that O(n²) complexity can be improved:
It’s another example of the ol’ DRY principle in action - don’t build new worlds with duplicated information; only work with the differences (as much as possible).
We don’t want to do that because we want (the new) base to be portable across GHC versions (and other hypothetical implementations) at all times. The empty library is trivially portable, and we move over definitions only if they are also.
…and right now (2022 Feb) that would require going as close to fully-parameterised (no type classes) as possible
I am interested in these things, but I don’t think overly opinionated instances in GHC are a problem. (Outputable is opinionated, but structured errors mean that pretty-printing is increasingly moved to the outskirts!)
If you want to talk about orphan instances more, would you mind forking that to a new thread? It is orthogonal to the low-hanging initial steps for both making GHC a real library and decoupling base.
Acknowledged. I’ve just searched for orphan here to see if there was an existing Discourse thread - this appeared in the results:
…which mentioned this:
which in turn refers to this:
May you be more successful.
Thanks! That is very useful. I am generally a fan of rebasing things no matter how ancient, so I should take a crack at rebasing @nomeata’s changes even though they are a decade old.
From its readme:
Some changes are just work-arounds due to GHC having the package name base hardcoded.
Yes, we should definitely make that more flexible so we don’t need to rebuild the compiler when stuff merely moves around. Settings file, maybe?
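Purely as a sketch of that idea: GHC’s settings file is read as a Haskell-style association list of string pairs, so one could imagine an entry like the following (the key is hypothetical, not a real GHC setting today):

```haskell
-- Hypothetical addition to GHC's settings file, which is parsed as a
-- [(String, String)] association list; this key does not exist today.
[ ("base package name", "ghc-base")
]
```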
I wrote up a new benefit that even the most trivial “beachhead” of making ghc-base contain everything and base reexport it all would realize:
Proposal: Relax instances for Functor combinators; put superclasses on <class>1 to make less-breaking · Issue #10 · haskell/core-libraries-committee · GitHub
https://hackage.haskell.org/package/haskell2020
…therefore:
https://hackage.haskell.org/package/haskell-glasgow2021
https://hackage.haskell.org/package/haskell-glasgow2024
https://hackage.haskell.org/package/haskell2028
…
with each being a “standardised snapshot” of:
https://hackage.haskell.org/package/haskell
in which all the latest products of active research debut. The haskell package can also serve as the point of separation (or abstraction) between base and GHC, allowing the two to evolve at their own pace.
Everyone else can then choose the level of stability and compatibility most suitable for their Haskell project.