The evolution of: Decoupling base and GHC

Only because I was invited, I tried to understand the ‘Decoupling base and GHC’ thread.

I hesitate to throw more distraction into there, so let me raise the question separately: what is the problem youse are trying to fix? It’s described in such vague terms, and there’s such disparate solutions offered, I’m wondering:

  • Has nobody written down specifics because everybody ‘just knows’ what’s wrong?
  • Or is all the discussion at cross-purposes, because people have different perceptions/they’re trying to fix different problems?

To begin at the beginning:

The CLC is in a difficult position: base is dilapidated, causing harm, yet breaking changes to base that make it better also cause harm.

Is base dilapidated? More dilapidated than it was in 1998 or 2010? What’s changing to cause this dilapidation? I monitor StackOverflow questions. I don’t see any that amount to that sort of complaint.

having the most nimble standard library of them all, and thus the best first impression on new aspiring Haskellers.

New Haskellers will take the standard library as is. Because that’s what corresponds to the Language Report and tutorial materials (except the ‘FTP changes’ rather fouled that up).

Please itemise some specific harms that the current structure of base is causing. By “itemise” I don’t mean motherhood-and-apple-pie claims like “flexibility” or “modularity”. If you mean:

amazing libraries so GHC can be easily “remixed” for all manner of research prototypes. [from the ‘evolution’ thread]

That’s likely to adversely impact new Haskellers: we don’t want to even suggest they tinker with standard libraries – indeed we want them to think very carefully before even creating a class or instance. Again itemise some research prototypes that are hampered by the current base structure.

(In similar vein, I’d like to see itemised some specific benefits that got delivered by the ‘FTP changes’ – because I haven’t experienced any. Then build on such examples to motivate what you want here.)

I half-agree with @Bodigrim

I believe that base is quite good; I believe that even stagnating and limping base is not a problem in an ecosystem, allowing for alternative preludes; I believe that GHC must evolve slower.

What’s wrong with the alternative Prelude mechanism? I disagree with the “evolve slower” – except, is that referring to GHC the compiler/language innovation, or to the (base) libraries, which I don’t count as GHC?

I’m not clear from that discussion how much of base is GHC-centric vs Haskell-centric:

would require going as close to fully-parameterised (no type classes) as possible:

  • …and less like Data.List.nub :: Eq a => [a] -> [a]

Although Haskell (the Language Report) is defined without mention of any specific classes or data types, in fact they’re snuck in all over the place:

  • foo 0 = "Zero" – inferred foo :: (Eq a, Num a) => a -> String
    fooF 0.0 = "ZeroF" – inferred fooF :: (Eq a, Fractional a) => a -> String
  • do { } – requires Monad a =>
  • [ ] – requires Lists
  • if then else desugars to a case with a Bool discriminant.
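
A minimal sketch of the points in that list, assuming nothing beyond the Prelude. The fallback clauses and the names ifThenElse and pairUp are added here purely for illustration; the signatures on foo and fooF are the ones GHC infers when they are left off.

```haskell
-- A numeric literal pattern is matched with (==) against the literal,
-- so Eq and Num (or Fractional) appear in the inferred type.
foo :: (Eq a, Num a) => a -> String
foo 0 = "Zero"
foo _ = "NonZero"

fooF :: (Eq a, Fractional a) => a -> String
fooF 0.0 = "ZeroF"
fooF _   = "NonZeroF"

-- if/then/else is sugar for a case on Bool.
ifThenElse :: Bool -> a -> a -> a
ifThenElse c t e = case c of
  True  -> t
  False -> e

-- do-notation is sugar for (>>=), hence the Monad constraint.
pairUp :: Monad m => m a -> m b -> m (a, b)
pairUp ma mb = do
  a <- ma
  b <- mb
  return (a, b)
```

None of these definitions imports anything, yet each one depends on a class or datatype that has to live somewhere in the standard libraries.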

It seems like you’re blaming GHC for implementing Haskell.

Bringing overlapping instances and orphans (and even MPTCs) into this discussion makes no sense to me.

  • There are no MPTCs in base(?)
  • Are there classes in base for which people want to write overlapping instances? Why?
  • If you have no overlaps, orphan instances are not problematic.

BTW overlapping instances have never been properly specified. Hugs’ implementation is a lot different to GHC – and IMO a lot more robust against orphans, even with MPTCs:

  • Two overlapping instances must be in a strict substitution sequence.
  • So you can make a (partial) ordering of instances just from the instance decls.
  • Then Hugs builds the ordering tree before looking at any usage sites – as a by-product that’s validating the instances in O(n/2).
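
To make “strict substitution sequence” concrete, here is a toy model, not Hugs’s actual code. The type Ty and the functions match, subsumes and compareHeads are all hypothetical names introduced for this sketch. It only shows that a partial order on instance heads can be computed from the declarations alone, by one-way matching.

```haskell
import Control.Monad (foldM)
import Data.Maybe (isJust)

-- Toy instance heads: type variables and constructor applications.
data Ty = V String | C String [Ty] deriving (Eq, Show)

type Subst = [(String, Ty)]

-- One-way matching: extend s so that the general head, under s,
-- equals the specific head. Variables on the specific side are
-- treated as constants.
match :: Ty -> Ty -> Subst -> Maybe Subst
match (V v) t s = case lookup v s of
  Nothing -> Just ((v, t) : s)
  Just t'
    | t' == t   -> Just s
    | otherwise -> Nothing
match (C c ts) (C d us) s
  | c == d && length ts == length us =
      foldM (\acc (t, u) -> match t u acc) s (zip ts us)
match _ _ _ = Nothing

-- g `subsumes` h: h is a substitution instance of g.
subsumes :: Ty -> Ty -> Bool
subsumes g h = isJust (match g h [])

-- Partial order on heads: Just LT means the first is strictly more
-- specific; Nothing means neither is a substitution instance of the other.
compareHeads :: Ty -> Ty -> Maybe Ordering
compareHeads a b = case (b `subsumes` a, a `subsumes` b) of
  (True, True)   -> Just EQ
  (True, False)  -> Just LT
  (False, True)  -> Just GT
  (False, False) -> Nothing
```

In this model `[Char]` compares as strictly more specific than `[a]`, while `(a, Int)` and `(Int, b)` come out as incomparable, which is the kind of pair the first bullet rules out.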

IMHO the dwindling position of Haskell, due to its learning curve, does say a lot about base.
At least two of the biggest complaints about Haskell were:

  1. String and Text discrepancy
  2. Lack of arrays in base

Idk, maybe these are just superficial symptoms and perhaps really FP is dying for good.

I assume that is O(n) per instance? That would be O(n^2) in total then, right?

Or is Hugs building the whole tree in linear time? That seems impossible… It should at least be O(n log n).

Anyway, I do think the Hugs approach is better than GHC’s approach, but GHC is currently the industry standard.

Agreed. When I first started learning Haskell, I didn’t even notice any of these weird quirks of base. As I progressed, I did think the common wisdom recommendations like "Don’t use String, use Text" were kind of weird - and it does seem a bit weird to me to discover that there are a few other string types too (like ByteString) - mostly I just ignore this stuff and use Text because mostly it seems inconsequential to me.

What does bother me (again, now that I’m not a beginner) is that base/Prelude has a lot of unsafe functions that explode if you use them, like list indexing (!!).

Ultimately, what hampered my growth as a beginner was the bad IDE tooling. I initially came from the C# world and I loved me some autocompletion and error squiggles. Not having these made it feel like the language was in some type of pre-release beta stage. I wondered if anyone actually used Haskell seriously and that discouraged me from investing time to learn it.
HLS has really turned things around! I feel like Haskell is a premium language now just on that alone.

What currently hampers my experience and growth is documentation. Hackage feels hard to navigate. I never know what package I’m really looking for, and can’t tell what’s high quality, what’s experimental, and what’s abandoned. Documentation is hard to discover on the packages as well (I’ve started to figure it out but it was tough initially). Purescript’s Pursuit was a lot more intuitive to use, and Elixir’s Hex/Hexdocs may be the best module discovery/documentation site I’ve ever seen. You could tell very quickly (intuitively) which modules were the best to use, and documentation on any module under any serious use had top notch documentation.

Anyways! I’ve diverged a bit from the topic of this thread. I know GHC and base have some issues that are making moving forward difficult, so I don’t want to be dismissive of that fact. I know there were some fixes that had to be made to GHC in order to implement HLS effectively. Seems to me like these are the types of improvements to GHC/base that will yield the greatest benefit for increasing our user base.

…yet another opinion: can you at least provide a URL to support this statement?


…or could it be that it’s just Haskell that’s withering away, as people look to other languages e.g. to avoid all this seemingly-perpetual “discussion” about various aspects of Haskell’s future!

From page 46 of 55 in A History of Haskell: Being Lazy With Class:

At the other end, mainstream languages are adopting more and more declarative constructs: comprehensions, iterators, database query expressions, first-class functions, and more besides.
We expect this trend to continue, driven especially by the goad of parallelism, which punishes unrestricted effects cruelly.

(…not to mention the “rising tide” of hardware-level concurrency.)

…so, as this decision seems to indicate:

An Epic future for SPJ

FP won’t be disappearing any time soon. As for Haskell…that remains to be seen.

You can easily find lots of accounts that consider Haskell one of the dying languages.

Well, it is likely that those features are simply there to lure FP programmers. The backlash against these features in those communities is huge - it remains to be seen if they will ever be adopted as common practice. (Notice that people prefer older Python.)
Also, see the emergence of Go with its simpler constructs. People nowadays prefer simpler, closer-to-no-code solutions.

I’ve found this: 7 Programming Languages That Will Die In A Few Years, which is hilarious since it lists C as a dying language. I can only hope Haskell will ever become as popular as C is even today.

Also this: 5 Programming Languages That Are Probably Doomed, which claims Haskell is dying or even already dead because it (I kid you not) has flatlined on being the 19th most popular programming language in existence (although I also don’t buy that statistic; Haskell should be lower).

Also remember that the global software developer population is growing. Estimated at 18.5 million in 2014 it is now estimated to be 26.9 million in 2021. That is an increase of almost 50%. I think it is reasonable to assume that beginners are more likely to start with more approachable languages, so Haskell is naturally doomed to fall in the ratings.

@Abab9579 your comments are unhelpful. Just stop. If you want to discuss whether/why Haskell or FP is dying, start your own thread. Get off this topic.

Well, it seemed like you most likely agree with the sentiment. At least you came off as disliking FP style.

@atravers

I see discussion of what to do. (Some of it copied into the ‘Decoupling’ thread, where it had already prompted my ‘why?’ question.) I still don’t see a clear statement of the problem.

(But good grief the CLC seems to be good at talking round in circles!)

At one of your links (please explain) there’s

  • Goal here isn’t actually to split up base

If I’m being dumb, and the motivation is so blinkin’ obvious nobody’s bothering to spell it out, then just humour me, and explain it in terms that would make sense to somebody who only consumes base and never looks inside its structure.

Some want to remove partial functions (like head or (!!)). Well ok, propose that, but it doesn’t need splitting anything. If that’s supported (I doubt it), people will just write their own – and roll-your-own is what other comments are trying to avoid.

Remove foldl? And do what instead? Again this doesn’t need splitting anything. If the deeper message is that Lists are almost always the wrong data structure, I’d agree. But blame all the intro texts that go through every durned function over lists, and that generate so many q’s on StackOverflow.

People want different Preludes. (Except some of the alternatives they advocate have bitrotted already.) Then is the problem that it’s really difficult to build alternative Preludes, and having built them they get broken by the next release of GHC? Is there a pattern to the breakage across all these different alternatives? Or is there a separate/partial cause in each case? Then no specific new structure for base is going to help everybody(?)

Let me suggest what I think the decoupling issue is about (and gently remind abab and atravers that there was a long argument on Haskell’s popularity at The evolution of GHC, and encourage them to keep any further comments there):

  1. Updates to base can cause people to need to update other code.
  2. People may want to update ghc to get bugfixes, new features, etc.
  3. In so doing, they may not want to update base, so as to avoid needing to update other code, including a potential nest of transitive dependencies.
  4. If base was decoupled so older versions could be installed on newer GHCs, this would alleviate this situation somewhat.

That’s it.
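
To make point 1 concrete, here is a hedged sketch of one well-known case: base-4.11 (shipped with GHC 8.4) made Semigroup a superclass of Monoid, so an instance like the one below had to be edited to compile on both sides of that release. MaxInt is a made-up type for illustration. The CPP guard keys on the compiler version, which only works as a proxy because each GHC ships exactly one base.

```haskell
{-# LANGUAGE CPP #-}
#if __GLASGOW_HASKELL__ < 804
-- Before base-4.11 the Prelude did not export Semigroup or (<>).
import Data.Semigroup (Semigroup (..))
#endif

newtype MaxInt = MaxInt Int deriving (Eq, Show)

instance Semigroup MaxInt where
  MaxInt a <> MaxInt b = MaxInt (max a b)

instance Monoid MaxInt where
  mempty = MaxInt minBound
#if __GLASGOW_HASKELL__ < 804
  -- Older base has no Semigroup superclass, so mappend must be given.
  mappend = (<>)
#endif
```

A package with many such instances, plus dependencies that each need the same treatment, is the “nest of transitive dependencies” from point 3.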

n here is the number of instances declared for this class. If instance j is more specific than instance i, Hugs inserts j in the tree in front of i and it’s done/no need to search the rest of the tree. If j is apart from every other instance (no overlap), it goes as a leaf of the tree, that needs (j - 1) comparisons to get there. (That’s a fairly crude algorithm, but simple to program.) So it’s only the very last instance that needs (n - 1) comparisons.
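
Taking that worst case at face value (instance j needs j - 1 comparisons when it overlaps nothing already in the tree), the total over n instances works out as follows. This is only arithmetic on the figures above, not a measurement of Hugs.

```latex
\sum_{j=1}^{n} (j - 1) \;=\; \frac{n(n-1)}{2} \;=\; O(n^2),
\qquad
\text{average per instance: } \frac{1}{n}\cdot\frac{n(n-1)}{2} \;=\; \frac{n-1}{2}.
```

So roughly n/2 comparisons is the average cost per instance, and the cost of building the whole tree is quadratic in the worst case.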

It could be made more efficient, if that really becomes an issue. (I guess it might for large classes like Eq.) Organise the tree as a BST-lattice hybrid sorted by the constructor of the first (probably only) type param. OVERLAPPING instances point to their most-specific OVERLAPPABLE. (That’s overkill for Eq, because I don’t think anybody wants overlapping instances in there. Then declare in the class whether overlapping allowed.)

Thanks, but this is exactly what nobody’s explaining. Let me try to spell it out. Tell me what I’m getting wrong.

“Well-typed programs can’t go wrong.” So if a new base has the same functions/methods with the same type, why “need to update other code”?

  • If base takes away a function/method, yes code will break.
  • If base merely moves/restructures existing functions/methods, who’ll notice? (But equally why should base bother if nobody’ll notice?)
  • If base adds a new class/function/method/datatype, yes that could clash with something of the same name in client code. (But I don’t see adding anything proposed in the threads.)
  • If base adds new instances for existing classes, yes that’ll clash with client code that had rolled their own. (This was one of the complaints with the ‘FTP changes’: not only were there new instances, they also had surprising semantics. In some cases I for one wanted those instances not declared at all, because no sensible semantics was possible.)
  • If base changes the signature of existing methods (for example making a free-standing function on Lists into a method of Foldable), this’ll lead to subtle breakages. This was the bulk of the complaints with the ‘FTP changes’. In theory List is an instance of Foldable so the change should have been invisible. In practice people had to add Foldable constraints all over their signatures.
  • If base imports some module/class/instances that were previously outside base, previously needing explicit import in client code to access them, the new base just piggy-backs on the original import. Or is the problem that client code wants to continue to use an older version of the import? Why, if its signature hasn’t changed?
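
A small sketch of the Foldable bullets above, runnable against any post-FTP base. The names lengthList, pairLength, maybeSum and count are hypothetical, chosen for illustration.

```haskell
import Data.Foldable (toList)

-- The pre-FTP shape: a free-standing function on lists only.
lengthList :: [a] -> Int
lengthList = foldr (\_ n -> n + 1) 0

-- Post-FTP, length is a Foldable method, so these typecheck.
-- The pair instance folds over the second component only.
pairLength :: Int
pairLength = length ('x', 'y')   -- 1, not 2

maybeSum :: Int
maybeSum = sum (Just 3)          -- 3

-- A helper generalised in the same way now carries the constraint
-- in its signature.
count :: (Foldable t, Eq a) => a -> t a -> Int
count x = length . filter (== x) . toList
```

Whether pairLength being 1 counts as “surprising semantics” is the matter of opinion; that it typechecks at all is the change.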

Or tell me straight: “Well-typed programs can’t go wrong” is bunkum.

You have listed at least four things that can lead to breakages and cause somebody to need to update their code. So I believe you have answered your own question – because changes to base do cause the api surface (as reflected in exported functions, datatypes, and type signatures) to change.

As to why do we want to change base in the first place:

  1. Obscure things in GHC.* modules force a PVP breaking version number bump even if none of the main commonly-used stuff changed.

And also many different opinions:

  1. Too much Int where negative numbers make no sense (e.g. lengths of things)
  2. Classes like Num with no clear laws.
  3. Partiality, especially in methods like *1 in Foldable and with Enum
  4. Anything steering towards unsafeInterleaveIO
  5. String stuff is too accessible
  6. Stuff throwing synchronous exceptions is too accessible
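
As an illustration of opinion 1 only: base already provides Natural and genericLength, so a non-negative length can be written today without changing anything. The names lengthN and minusN are hypothetical.

```haskell
import Data.List (genericLength)
import Numeric.Natural (Natural)

-- A length that cannot be negative by construction.
lengthN :: [a] -> Natural
lengthN = genericLength

-- The trade-off: plain (-) on Natural throws at run time when the
-- result would be negative, so a total subtraction returns Maybe.
minusN :: Natural -> Natural -> Maybe Natural
minusN a b
  | a >= b    = Just (a - b)
  | otherwise = Nothing
```

This moves the partiality from “negative lengths” to subtraction, which is part of why the list above is a set of opinions and not a settled design.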

These are opinions — you need not agree with them. But at the very least, even if we only add new features to the “main” parts of base and do nothing breaking, we still have the problems @sclv mentioned because GHC might have breaking changes, and people ought to be able to get new non-breaking base changes without GHC changes.

OK so counting from after the ‘FTP changes’ brouhaha, which of those breaking changes has base perpetrated? And how would a reorganised base have avoided the breakages, or at least minimised the impact?

@Ericson2314 seems to be throwing in the kitchen sink. To pick one point as an example:

  1. Classes like Num with no clear laws.

Has base made breaking changes to Num? Indeed has base made any visible changes to Num? I rather thought Num is the same now as Haskell 98. Then don’t mix up complaints about breaking changes with complaints about (what turned out to be with a great deal of hindsight) poor design. In the early 1990s when the Prelude was developed, who’d have expected typeclasses would be so wildly powerful, and would have connected to Category Theory?

As I mentioned, Num (and Integral, Fractional) are pretty much baked into the definition and syntax of Haskell. Nothing’s stopping you creating a whole bunch of other Numerical classes and operators with all the nice properties.

  1. Partiality, especially in methods like *1 in Foldable and with Enum

Has base changed methods/functions to be more partial? Again don’t mix up complaints about breakages with complaints about a design you don’t like. Again there’s nothing stopping you creating safeHead, etc. (I’m not going to defend design choices in Foldable; but any more changes to it had better have an enormous benefit/cost ratio.)
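
A minimal sketch of the roll-your-own route, using only base. safeHead is the name used above; safeIndex is a hypothetical name introduced here for a total counterpart of (!!).

```haskell
import Data.Maybe (listToMaybe)

-- Total head: Nothing on the empty list.
safeHead :: [a] -> Maybe a
safeHead = listToMaybe

-- Total indexing: Nothing for a negative or out-of-range index.
safeIndex :: [a] -> Int -> Maybe a
safeIndex xs n
  | n < 0     = Nothing
  | otherwise = listToMaybe (drop n xs)
```

Neither definition needs base to be restructured; the cost is that every caller now has to handle the Maybe.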

A usage of head or (!!) is not necessarily unsafe: it may be surrounded by checks to avoid calling it unsafely. Yes it’s unfortunate those aren’t type-safe checks, in a language which vaunts the benefits of type safety. “Well-typed programs can’t go wrong” is bunkum.

I think a better approach is through education: stop teaching newbies so much about Lists (including String) and so little about appropriate datatype design/including especially other off-the-shelf recursive data structures.

So is it this (below) what you want? And is this the opinion of you all:

  • Reorganise base so a program can (for example) exclude Num and all its Pomps; then
  • instead import theoretically-pure ShinyNum using all the same class and operator ids; and
  • otherwise use standard Prelude.

How about GHC wired-in modules for implementation stuff like arithmetic on pointers and indexes? Is that also to use ShinyNum? How about modules like Vector with Int indexes; or Data.Set with a size :: Int embedded in every node and (Num.+) to calculate it? Checking for numeric over/underflow or index-out-of-array comes at computational cost. Those modules are already making limited checks, enough to avoid IllMemRefs. ShinyNum will duplicate work for programs that already don’t throw those exceptions. Who’s then responsible for addressing the performance degradation?

@antc2 you have misunderstood my post in the way I feared would happen.

Right now, we can’t even seriously debate all these changes to base, because there would be far too much breakage simply because users are stuck with the version of base GHC ships with. With the decoupling, we at least expand the Overton window on things the CLC can consider, and I think that is very good, even if none of those no-longer-beyond-the-pale changes end up being accepted.

Also, see what @atravers linked in Pre-Pre-HFTP: Decoupling base and GHC - #43 by atravers ; there is better motivation there than what I wrote.

Speaking of String, here’s a very good example. I actually have little problem with [Char] being easily obtained from literals, for if beginners need to eliminate strings, this is the gentlest way there is. The real problem with String is not that it exists, or even that interfaces that use it exist, but that type classes do ridiculous shit to support it.

For example, the Read class is a disaster:

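class Read a where
  readsPrec    :: Int -> ReadS a   -- Haskell 98 method
  readList     :: ReadS [a]        -- Haskell 98 method
  readPrec     :: ReadPrec a       -- GHC's ReadPrec-based counterpart of readsPrec
  readListPrec :: ReadPrec [a]     -- GHC's ReadPrec-based counterpart of readList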

The bottom two are the proposed replacements for the top two… so we should have a deprecation cycle to move the top two outside the class, just as return = pure should become mandatory. So:

class Read a where
  readPrec :: ReadPrec a
  readListPrec :: ReadPrec [a]

But then readListPrec is a hack for String. If we had

newtype String = String [Char]

that could have the string-literal instance, and then we can delete readListPrec.

readListPrec and friends are, in my view, overlapping instances laundered as extra methods. Not good!

D’oh please stop using apocalyptic language. You’re just undermining your case. (Are there more than two type classes with specific provision for String?) Read has been doing its job happily for 30 years. It’s not broken/don’t fix it/don’t use it as a pretext for breaking all sorts of other stuff.

Show uses exactly the same ruse for [Char] vs arbitrary [a]. So you’ll have to change that at the same time. But there might be good reasons I want to show a [MyType] in a special format and be able to read back in that format. Yes, it’s a cheat to avoid overlapping instances. But since overlapping instances still aren’t blessed, it’s the lesser of two evils.
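
For anyone who hasn’t met the ruse, here is a hedged sketch of it on the Show side. Bit is a made-up type; showList is the real class method that the Char instance overrides so that a [Char] shows in quotes.

```haskell
newtype Bit = Bit Bool

instance Show Bit where
  showsPrec _ (Bit b) = showChar (if b then '1' else '0')
  -- Overriding showList changes how a whole [Bit] is shown,
  -- without writing an overlapping instance for [Bit].
  showList bs = showString "0b" . foldr (.) id (map shows bs)

-- The list instance just delegates to the element type's showList:
--   show [Bit True, Bit False]  gives  "0b10"
--   show "abc"                  gives  "\"abc\""   (Char's showList)
--   show [1, 2, 3 :: Int]       gives  "[1,2,3]"   (the default showList)
```

readList and readListPrec play the same role for Read, which is why changing one class means looking at the other.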

(The class gives default definitions for the two methods you want to move out, so I guess moving them will be mostly invisible. Unless somebody’s giving some custom overloading somewhere. I’m still not seeing any motivation for tinkering.)

There’s a thread discussing the threshold to clear for a breaking change. That seems to have fizzled out without conclusion. Then isn’t the Read proposal going to go the same way as the moving out (/=) proposal? (Unusually for me – some would say – I didn’t comment/just couldn’t even … I do regret the community uses up so many cycles on such small issues.)