The evolution of GHC

Let’s change from workflow to traffic flow: you’re in charge of a multi-lane highway, and strange bumps have started to appear. Until you have a solution:

  • do you assume everything is fine, keep all lanes open and leave it to drivers to dodge those bumps;
  • or are you cautious and close down some lanes, including the ones with bumps?

…not stop their work, just slow it down while you’re finding effective ways to fix and improve the workflow. It just makes no sense to keep the current rate of $ACTIVITY contributions when, according to you, the workflow is problematic.

According to this:

…expecting the “rising pressure” to now somehow result in solutions - in this case - also seems less than viable. But if you want to prove me wrong, go for it!

Ok, if you feel that’s the best next step, please find a meaningful way to see that through. (Prove me wrong, go for it!)

I often find myself in this position, and when I can do something about it, I start addressing the bumps in the workflow by 1) asking questions of the relevant stakeholders/contributors, and 2) identifying changes that have the potential to smooth out those bumps, building consensus on those changes, and implementing those improvements. More often than not, implementing those improvements has an immediate effect in alleviating pressure, and with each iteration, more and more pressure is lifted.

I think we’re in the middle of that process, and I think the Stability WG is one of the first signs that we’re going in the right direction.

…(heh) anyone for upload limits on Hackage?

I will defer to your judgment on that point. As for the Stability WG, let’s see what’s happened there in six months’ time…

I think the Stability WG is one of the first signs that we’re going in the right direction

This has been a long thread. The Stability WG will be much more effective if it is informed by

  • clear, precise, and specific identification of the problem(s)
  • creative, specific, and actionable suggestions for how we might address these problems.

For example, “world peace” is a desirable goal, but it is too vague to be actionable. In GHC’s case, “don’t break code” is clearly a desirable goal, but we need ways to reconcile that goal with maintaining dynamism. For example, making no further changes whatsoever in the language or libraries would clearly address the breakage issue, but I don’t think anyone is advocating that solution. @ketzacoatl says “I do not think the problem is the rate of change. The problem is with the process used to roll out that change.” That is good to hear. I wonder if we could brainstorm what a better process would look like, in specific terms?

The more specific our discussion can be, the better it will inform the Stability WG’s thinking.

…as long as we remember that change doesn’t only break programs and libraries - e.g. documentation is also affected:

…along with sites like Hackage and Hoogle. A quick search on programming language evolution turned up this:

https://www.uib.no/en/rg/put/97531/co-evolution-software-languages-and-language-processors

I’ll leave it to those smarter/more experienced than me to evaluate it…

Well, in its simplest form, it’ll probably involve some sort of “staging”, “grouping”, and “delay” for the rollout of those changes - in other words, structuring or organizing how those changes are rolled out.

What we’re talking about probably includes deprecation cycles, and methods for “graceful migrations”.

What is a migration? A change from one state to another.

What is a “graceful” migration? A state transformation that minimizes the possible negative impacts of the change in state.

For example, if I am upgrading an application service from version A to version B, and that upgrade involves a change to both an application and a database, there are many ways I can roll out those changes, some of which will minimize negative impacts, and others that won’t. For example, the software may be updated in a way that only supports one version of the database schema at a time, rather than both the current state and the new state… or the schema and application could be updated in a way that ensures both the old and new states are supported. That usually means making smaller changes over time, rather than one larger breaking change all at once.
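To make that a little more concrete in Haskell terms - a minimal sketch, with hypothetical names, of the “support both states” idea - GHC’s DEPRECATED pragma lets a library keep the old interface alive for a release cycle while pointing users at its replacement:

    -- In some hypothetical library module:
    module Data.Example (newName, oldName) where

    -- The new interface.
    newName :: Int -> Int
    newName = (+ 1)

    -- The old interface is kept for at least one release cycle; using it
    -- emits a warning that points users at the replacement.
    {-# DEPRECATED oldName "Use newName instead" #-}
    oldName :: Int -> Int
    oldName = newName

Users keep compiling (with a warning) during the transition window, and only lose oldName when it is finally removed in a later release.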

In terms of GHC, base, core libraries, and the package ecosystem, I don’t yet have enough experience with the details to offer specific changes, and I would imagine other contributors have much better ideas ATM than I do, so let’s find ways of collecting those together.

That said, I would guess that we should probably run some thought experiments, using both past and potential future changes as our examples. There have been plenty to choose from, and there are many changes planned or in progress that we can use.

In “Pre-Pre-HFTP: Decoupling base and GHC”, @Ericson2314 suggested:

Could this be generalised to the whole Haskell ecosystem?


So the options are:

  • several “small” interruptions, one for each small change;
  • or one “large” interruption as a result of one big breaking change.

Either way, people are being interrupted while attempting to do what interests them, which is probably why they chose to use Haskell to begin with:

  • if they’re interrupted enough times, they’ll start to reconsider that choice;
  • interrupt them some more, and they’ll choose another language, and probably tell others about “their Haskell experience”.

Without some way to mitigate the cost of being interrupted, the “subdivision” of large breaking changes into a series of small ones seems to have little net benefit (if any)…


That goes for me too. As noted by @simonpj, this thread is already quite long - a new thread to collect those thoughts and ideas would be a good start, rather than constantly jumping to the end of this one.

That’s not quite what I was describing.

The smaller changes are not each interruptions in their own right - the large interruption is the result of making too much change at once, in a way that introduces the interruption. Conversely, the smaller changes provide a graceful experience by bridging across the gaps, so there are no interruptions. Using the database-backed application as an example, the app and the database schema each support two versions of the other, such that there is never a version of the app running which does not support the current state of the database, and the database is never updated in a way that breaks either of the two app versions that may be running.
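As a minimal sketch of that bridging - with hypothetical types, not anyone’s actual schema:

    -- The application understands both schema versions at once, so it keeps
    -- working before, during, and after the database migration.
    data UserV1 = UserV1 { fullName :: String }
    data UserV2 = UserV2 { firstName :: String, lastName :: String }

    -- Rows are decoded into whichever version the database currently holds.
    data User = FromV1 UserV1 | FromV2 UserV2

    displayName :: User -> String
    displayName (FromV1 u) = fullName u
    displayName (FromV2 u) = firstName u ++ " " ++ lastName u

Neither rollout step breaks the other side: the app can be deployed before the schema changes, and the schema can change before the old rows are rewritten.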

…which I thought Hackage already enables - the previous version of a package is retained, not deleted, whenever a new version is uploaded. That being the case, users of that package can remain with the superseded version until such time as they’re ready to update their dependencies in order to switch to the latest version.

Again, it could be just a lack of imagination on my part: I’m not quite sure what feature/s Hackage needs to make that process “graceful”…
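For what it’s worth, the “remain with the superseded version” part is already just version bounds - a sketch with a hypothetical somelib dependency in a .cabal file:

    library
      -- Stay on the superseded series until we're ready to migrate;
      -- Hackage keeps every release available, so this still builds.
      build-depends:
          base    >= 4.14 && < 4.15
        , somelib >= 1.2  && < 1.3

…so perhaps the missing “graceful” part is less about retention of old versions and more about tooling for the migration itself.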

There is something noteworthy about GHC’s language extensions. I was never bothered by adding one when it seemed somewhat useful. By no means do I understand the full implications of them all (they also differ substantially in theoretical complexity).

The advances towards Simple Haskell do make sense to me (thinking of bigger projects for bigger teams); the concrete idea of going back to Haskell 2010 or Haskell 98, not so much.

Some sort of consensus on a set of language extensions that seem proven would at least reduce potential confusion for a newcomer and help create some sort of standard.

What’s missing in the picture is a bold design choice: “This is what we think you should be able to do with our compiler. This feature is meant for use in production; this is research.”

“Duh”, you’ll say, “no-one can make an informed decision on that!” This is why GHC keeps it flexible in both directions (you can activate any extension, but none is activated by default).
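As a minimal illustration of that flexibility: extensions are opted into per module, so nothing beyond the base language is enabled unless a file asks for it:

    {-# LANGUAGE LambdaCase #-}  -- opted in for this module only

    module Example where

    -- Without the pragma above, \case would be a syntax error.
    describe :: Maybe Int -> String
    describe = \case
      Nothing -> "nothing"
      Just n  -> "got " ++ show n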

It might be that any bold design decision is better than none, at the moment, even though no-one knows what the “optimal” direction of GHC really is. And really, the language extensions are just one aspect of that.

While the broader package ecosystem is relevant, and any solution needs to consider that aspect of our experience, I think the problems and challenges we’ve been discussing primarily start with, and are most relevant to, GHC and base, then the core libraries, then the rest of the Hackage ecosystem (in that order).

It appears that the sources for both ghc and base are already on Hackage. If that’s the case, then given my previous point regarding Hackage’s retention of earlier package versions, what is stopping us from making full use of that capacity to achieve a graceful migration towards a “decentralised” base?

I think we need the help of a Hackage specialist…

Would this be bold enough?

…that discussion concluded with me suggesting that it would probably be easier to make Elm non-strict by default rather than “embedding” it in Haskell.

So here’s another [attempt at a] bold idea: let’s take an existing statically-typed strict language and make it non-strict:

  • if it fails, the products of the effort can be studied and the results used by other projects;
  • but if it succeeds, there’s the potential to replicate the success of Haskell through the formation of a whole new community, as interested users of that existing language exercise their curiosity.

Before Haskell, there used to be a language called Lazy ML - anyone for Lazy Standard ML ’97… er, Non-Strict ML 2027?

I am not experienced enough to say this with confidence, but AFAICT, the problem is with how we roll out change in the GHC/base source, not the hosting or availability of different versions of those packages.

There are other languages/compilers that do not break code, and would rather create a new language instead of breaking existing code. An example is Go.

So I think that goal is absolutely actionable. But it would boil down to maintaining a stable fork of GHC (e.g. 8.10.7) and observing its popularity.

An alternate name, e.g. ghc-2021-08.r0, would make it clear that this is a stable fork (snapshot?), and as bugs are found and removed, only the “release number” .rN would change.

If this works, the same could be done for the last of the ghc-9 series (upon the arrival of ghc-10); only then would ghc-2021-08 stop receiving bug fixes - that activity switches to the renamed “end-version” of ghc-9.

All going smoothly (!!!), this gives people the choice of:

  • relative stability with the renamed, maintenance-only end-version of the previous release series (currently ghc-2021-08, a.k.a. ghc-8.10.7);
  • or the current release series (ghc-9.2.1 or, if needed, newer: ghc-HEAD), for adding and testing language and other extensions.

Of course, this presumes there are enough Haskellers left to maintain both versions of GHC. As for having enough resources, someone with previous experience in the upkeep of GHC can let all of us know if any of this is actually possible…

Of course, this presumes there are enough Haskellers left to maintain both versions of GHC. As for having enough resources, someone with previous experience in the upkeep of GHC can let all of us know if any of this is actually possible…

Yes, for me it’s really key to make things more efficient so we can do more with the limited people we do have. An extremely modular standard library implementation with minimal coupling to the compiler is not something other languages/implementations have really nailed yet, and it could be a big advantage, allowing us to do more with less.

In general, I am trying to make things more efficient for existing contributors, and also to entice new ones, as modularity also lowers the barrier to entry, as @doyougnu pointed out in the other thread. These are very nice multipliers. We should pluck just enough low-hanging end-user-benefit fruit that we don’t feel guilty working on these multipliers :) - and then switch to working on them.

Alright, here’s a wild idea: instead of trying to split up base, just rename that giant pile to reflect its seemingly-intractable entanglement with GHC, e.g. ghc-base. Then you can introduce an all-new std-base, learning from the mistakes (er, experience) of writing ghc-base.

Over time, more code will appear in std-base, making code in ghc-base redundant. That old code can then be removed, resulting in a smaller and smaller ghc-base.

Assuming ghc-base is sufficiently reduced, GHC itself can be switched over to std-base. The code in ghc-base not made redundant by the switchover would then be added to GHC’s code base. We can then put ghc-base out of our collective misery…

So no splitting, just a slow process of attrition in the old ghc-base as it is slowly superseded by the new std-base. Instead of trying to carve up the pile that is currently base, people can start writing clean new code for its successor.
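A minimal sketch of how that succession could start - ghc-base and std-base being the hypothetical names from above - is for std-base to begin life as thin re-export shims over ghc-base, each replaced by a clean implementation when one is written:

    -- In the hypothetical std-base package: a thin shim over the old code.
    module Std.Data.List
      ( module Data.List  -- re-export the ghc-base implementation, for now
      ) where

    import Data.List

Each shim that gains a fresh implementation is one less module std-base needs from ghc-base.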

That was the first step of the plan in the original thread. The only trick is making std-base work with multiple versions of ghc-base, without erring too much on the side of stuff remaining in ghc-base.

I’m so glad we’ve come full-circle and are back at that original topic!