Towards a prevalent alternative prelude?

I would jump to a bigger, safer and more beautiful prelude at once. And I would most like it to be called base-5.

* * *

The good note that there are different classes of use cases has already been given and I think it is essential that we spell these classes out. The simplest is the dichotomy of libraries and applications — I am going to consider it presently. Another note to keep in mind is that the purpose of a prelude is twofold: to enable presence and to advertise absence.

Examples.

If I am writing a pure library for munching big data:

  • There is really no point for me in having potentially useful data structures dispersed across many packages. I would rather have all of them at hand. Enable me to use the best tool for the job by default.

    containers kind of does the right thing here, and it should be pushed farther.

  • At the same time, I would rather not have anything to do with IO, Template Haskell and unsafeCoerce in this library, and I would like to be enabled to advertise that. The more features I can advertise not having — the better.
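
One existing mechanism that goes some way towards "advertising absence" is GHC's Safe Haskell extension, which the post does not mention and which is used here only as an illustration. The sketch below assumes a hypothetical pure helper called mean. Under the Safe pragma GHC refuses Template Haskell and imports of modules marked unsafe, such as Unsafe.Coerce; it does not forbid IO as such, so the absence of IO is still carried by the types.

```haskell
{-# LANGUAGE Safe #-}
-- With the Safe pragma, GHC rejects Template Haskell in this module and
-- any import of a module marked Unsafe (Unsafe.Coerce, System.IO.Unsafe).

import Data.List (foldl')

-- | Arithmetic mean of a list, or Nothing for the empty list.
-- 'mean' is a hypothetical example function, not part of any library.
mean :: [Double] -> Maybe Double
mean [] = Nothing
mean xs = Just (total / fromIntegral count)
  where
    (total, count) = foldl' step (0, 0 :: Int) xs
    step (s, n) x = (s + x, n + 1)
```

Adding `import Unsafe.Coerce` to this module should make compilation fail, which is the kind of guarantee a reader of the library can rely on without auditing the code.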

If I am writing an application:

  • I may need streaming, concurrency, error handling, an effect system… Some of these are represented by many nearly indistinguishable choices. I would like to have a starter pack that makes the safest and the most approachable choice for me.

    rio is a kind of prelude that makes steps in this direction, choosing the effect system of ReaderT IO for you, among other things.

  • I would like to advertise the absence of unsafe functions.

    For a bad example, vector is not something I would like to use since it makes no effort to provide a total interface. Or, for that matter, base!

    For a good example, protolude exports only a single unsafe function panic that emits a warning upon being compiled — this means that a combination of protolude and -Wall -Werror is a guarantee that only safe primitives are used.

Commons.

This gets us to a category common to everyone: obsolete things that should go away. base has too many of them.

  • Why do we have String in base, when there is bytestring and text?
  • Why do we have lazy IO in base, when it is known to be problematic and there are many good alternatives, such as the conceptually simple io-streams?
  • Why do we have ports of C libraries in base, such as printf and getOpt, when we have libraries that do the same in a type safe way?
  • Why do we have unsafe functions like head in base when pattern matching is the «killer feature» of Haskell?

The list goes on.
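
To make the point about head concrete, here is a small sketch of the total alternatives that already exist in base. The names safeHead, firstOf and describe are made up for the example.

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE

-- | Total replacement for 'head': the empty case is visible in the type.
safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x

-- | With a non-empty list the first element always exists.
firstOf :: NonEmpty a -> a
firstOf = NE.head

-- | Pattern matching makes the caller say what happens on empty input.
describe :: [String] -> String
describe []      = "nothing to report"
describe (x : _) = "first item: " ++ x
```

None of these can crash at run time, and with -Wall GHC warns if a case is forgotten in a definition like describe.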

There should be a process for throwing obsolete things away, lest we drown. There is no hiding behind «blessed» alternative preludes from this. All of us need a base that we can be proud of, that shows the best of Haskell.

10 Likes

We are quite rightfully discussing what we can do in the long term to improve the state of base, but can we also, in parallel, immediately pick the low hanging fruit of marking as deprecated such functiones non grata as head and foldl? If I were dictator of the prelude I would put such functions at the bottom of their respective Haddock pages under the heading “Deprecated functions: do not use”, marked with a {-# DEPRECATED #-} pragma and an explanation of what to replace them with.
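
As a rough sketch of what such a deprecation could look like, the snippet below attaches a DEPRECATED pragma to a stand-in for foldl and names the replacement in the message. The functions lazyFoldl and strictFoldl are hypothetical; this is not how base is written today.

```haskell
import Data.List (foldl')

{-# DEPRECATED lazyFoldl "Accumulates thunks on long inputs; use strictFoldl (Data.List.foldl') instead" #-}
-- | Stand-in for a function one would like to retire.
lazyFoldl :: (b -> a -> b) -> b -> [a] -> b
lazyFoldl = foldl

-- | The replacement that the deprecation message points to.
strictFoldl :: (b -> a -> b) -> b -> [a] -> b
strictFoldl = foldl'
```

GHC reports the message at use sites in importing modules, not inside the defining module, so existing code keeps compiling and only gains a warning.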

15 Likes

My take is that a new alternative to base (I like the idea of calling it std) is warranted, but that the initial focus should be not on a batteries-included re-export of a rich library ecosystem, but on laying the basic foundations for a stable clean design starting with the most basic functions and data types, around which a more complete library ecosystem can emerge over time.

This would be a substantial project, not just a packaging effort. It probably requires funding for at least one full-time developer who’s capable and enthusiastic about seeing this through; a larger team would move things along once the core ideas are in place. Funding is of course a challenge, so this may not be possible.

To me this means some or all of:

  • Integral types based on new unboxed sizes upcoming in GHC 9.2 (IIRC).
  • A modern UTF8 string type (re-implementation of text)
    • A new user writing “Hello World!” should be working with real strings
  • A built-in octet-string type (similar to bytestring char8)
  • A built-in byte array type (similar to bytestring word8)
  • String, octet-string and binary builders
  • Some basic containers
    • Lists
    • Mutable and immutable arrays
    • Pure immutable containers beyond lists
      • Sequences, with efficient concat, prepend, append, and index
      • IntMaps, Maps, HashMaps and Sets
  • An updated interface to file I/O
    • UTF8 string and raw octet-string filenames
    • Avoid lazy I/O
    • Therefore a well-designed built-in monadic stream type.
    • Therefore reasonably prompt resource cleanup, without async exception leaks
  • Type-safe “printf”
  • A better record interface. CTRex???
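
For comparison, this is roughly what "Hello World!" with a real string type looks like today using the existing text package rather than the proposed library, which does not exist. The names greeting, shout and sayHello are invented for the sketch.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- | A string literal typed as Text thanks to OverloadedStrings.
greeting :: Text
greeting = "Hello World!"

-- | Text operations live in Data.Text, not in the Prelude.
shout :: Text -> Text
shout = T.toUpper

-- | Printing needs yet another module, Data.Text.IO.
sayHello :: IO ()
sayHello = TIO.putStrLn (shout greeting)
```

The language extension, the extra package and the three imports are the overhead that a built-in string type would presumably remove for a newcomer.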

One might well look at OCaml, Rust, Go, Java and Python and see which common features of their core libraries are good ideas that can be sensibly carried over into a lazy functional language. What are the core types and operations, and what are extensions that are not part of the standard library?

Yes, in the process various partial functions should be avoided, … and ultimately the standard library could be more rich than the current Prelude, but the initial work is I think to design a coherent minimal starting point.

9 Likes

But can we please call it standard and not std. Or some other actual English word with vowels.

3 Likes

Before developing a new base or Prelude replacement, I want to remind you that one of the Haskell Foundation goals is to support existing entities within the Haskell ecosystem, and not create new ones. So, if the new standard library is going to be created instead of helping one of the existing ones, it should be completely clear why this decision was made, what are the differences of the entirely new standard library from the existing alternative preludes and why it wasn’t possible to base a new solution on one of the already implemented libraries.

  • closely follows the structure of base (where appropriate) and is relatively conservative with introducing its own abstractions
  • does away with partial functions and known laziness foot-guns (e.g. foldl)
  • re-exports the libraries that nearly every Haskell program uses (e.g. containers, text, bytestring, deepseq, etc.)

That sounds very similar to what relude does. Moreover, relude reexports only boot libraries (except unordered-containers), so it doesn’t contribute much to the dependencies footprint. Though, we still recommend using it only in applications.

However, I have a comment regarding reexporting containers, unordered-containers, text, bytestring and similar. The names of functions in those libraries conflict heavily, so it’s not really possible to “reexport” those libraries. What would help is a language feature like “qualified reexports” that would allow alternative preludes to reexport names under a qualified namespace, so that users of an alternative prelude could just write Map.lookup or HashSet.member or ByteString.concat without the need to add dependencies to .cabal and import new modules. There were several proposals about this feature to GHC in the past, but they are not implemented yet.
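
To illustrate the status quo being described, here is a minimal sketch of what each user module has to write today, in addition to listing containers and bytestring in its .cabal file. The bindings ages, vowels, lookupAge, isVowel and joined are made up for the example.

```haskell
import qualified Data.ByteString as ByteString
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

ages :: Map.Map String Int
ages = Map.fromList [("ada", 36), ("alan", 41)]

vowels :: Set.Set Char
vowels = Set.fromList "aeiou"

-- Map.lookup, Set.member and ByteString.concat coexist only because of
-- the qualifiers; unqualified, names such as 'filter', 'map' and 'null'
-- would clash with each other and with the Prelude.
lookupAge :: String -> Maybe Int
lookupAge name = Map.lookup name ages

isVowel :: Char -> Bool
isVowel c = Set.member c vowels

joined :: ByteString.ByteString
joined = ByteString.concat [ByteString.pack [104, 105], ByteString.pack [33]]
```

A qualified reexport feature would let a prelude provide the Map, Set and ByteString prefixes itself, so the three import lines above would disappear from user code.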

Alternatively, Backpack can be used for containers or string signatures to solve the name-conflicting problem, but this solution has other ecosystem drawbacks.

14 Likes

My biggest problem is base being shipped with ghc, which makes for a good deal of dependency-hell pain.
Concerning the content, I think many alternative preludes are rightly focusing on totality, efficiency and safer defaults, which is great.

1 Like

I agree with these reservations. For non-serious projects, I just want things to work and want to be able to access others’ projects without a lot of configuration fuss. It’s bad enough now writing stuff to be compatible with different permutations of Monoid/Semigroup for older compilers.
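
For readers who have not hit this, a sketch of one common way to write an instance that compiles both before and after Semigroup became a superclass of Monoid (base 4.11). The type MaxInt is hypothetical, and this is only one of several compatibility patterns in use.

```haskell
-- Needed on older base versions, where the Prelude does not export
-- Semigroup; harmless on newer ones.
import Data.Semigroup (Semigroup (..))

newtype MaxInt = MaxInt Int
  deriving (Eq, Show)

instance Semigroup MaxInt where
  MaxInt a <> MaxInt b = MaxInt (max a b)

instance Monoid MaxInt where
  mempty = MaxInt minBound
  -- Required on base < 4.11, where Semigroup is not yet a superclass of
  -- Monoid; on newer versions it merely restates the default.
  mappend = (<>)
```

Every library that wants to support that range of compilers carries boilerplate of this kind, which is the fuss the post refers to.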

For more serious use, those who need them can pull in alternative preludes if they think that the advantages warrant it.

Yes, this is the right question. My answer would ultimately refer back to the fact that base is versioned together with GHC. This places the user in a situation where they must choose between a new compiler (which they might require to support a new platform, for instance) and interface stability.

4 Likes

My take is that a new alternative to base (I like the idea of calling it std) is warranted, but that the initial focus should be not on a batteries-included re-export of a rich library ecosystem, but on laying the basic foundations for a stable clean design starting with the most basic functions and data types, around which a more complete library ecosystem can emerge over time.

I’ll admit that I am quite skeptical of this approach. The spirit of my original proposal was intended to move the ecosystem in a direction that would allow further incremental improvement of what we already have while easing the user on-boarding experience. With respect to this goal, throwing away the core library ecosystem would be counter-productive, only serving to further fragment matters and creating a significant Python-3 effect which I do not believe we can afford.

Moreover, it is not clear to me that there is anything fundamentally wrong with the status quo. Yes, text and bytestring's interfaces could be more consistent. Perhaps text should use a UTF-8 internal representation. Maybe we provide too many partial functions. It would be great if bytestring didn’t insist on pinned allocations. However, none of these issues require a fresh start to fix. The overall “shape” of the libraries is, to my eye, generally correct.

In my opinion, there would need to be an extraordinarily good reason to throw away any of the core libraries. foundation has already tried the “throw away the world” approach and, while it has some good ideas, it has not caught on. The reason is clear: network effects are real and consequently the value of a collection of compatible libraries is far more than the sum of its parts.

5 Likes

Can we figure this out before we get into a new prelude?

It seems the root of one problem is the fact that base has a personality complex and does too many things with conflicting requirements.

I would propose cleaning this situation up in some fashion.

For example, ghc could have its own base/core lib that follows ghc closely as needed, but a separate base/standard lib that is decoupled from GHC and which can move more freely.

It seems any discussion about re-exports for ergonomics or a whole new prelude is bikeshedding that should wait or be in a separate effort. The most critical updates should be prioritized, with other updates/changes coming separately and afterwards, right?

So what are the most critical updates?
Can we follow some reasonable deprecation process and go through with those updates?

4 Likes

@ChShersh says

Before developing a new base or Prelude replacement, I want to remind you that one of the Haskell Foundation goals is to support existing entities within the Haskell ecosystem, and not create new ones. So, if the new standard library is going to be created instead of helping one of the existing ones, it should be completely clear why this decision was made, what are the differences of the entirely new standard library from the existing alternative preludes and why it wasn’t possible to base a new solution on one of the already implemented libraries.

I completely agree. The construction of a new alternative standard library is not something to be taken on lightly. relude does look quite thoughtfully composed and extremely well-documented.

That being said, my largest reservation about relude's particular approach is that it introduces its own abstractions and hides a good amount of functionality of the libraries on which it is based. For instance, only very little of the functionality of Map is exposed and this is done via a relude-defined typeclass. If I want, for instance, minView, I need to reach outside of the standard library into containers.
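
For context, minView is an ordinary function from containers; a minimal sketch of using it directly, which under the setup described above needs a containers dependency and import of its own. The helper drain is invented for the example.

```haskell
import qualified Data.Map.Strict as Map

-- | Values in ascending key order, obtained by repeatedly removing the
-- entry with the smallest key via 'Map.minView'.
drain :: Map.Map k a -> [a]
drain m = case Map.minView m of
  Nothing        -> []
  Just (v, rest) -> v : drain rest
```

For instance, `drain (Map.fromList [(3, "c"), (1, "a"), (2, "b")])` walks the map from the smallest key upwards.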

4 Likes

In an ecosystem where there are options, some of which are more “generally correct” than the other options, the defaults should be generally correct.

When a beginner reads a tutorial, they should be told to use the “correct” variant, but they are not. Most of us learn to use “bad” variants and then have to get lucky and find out that there’s a better way. Working in industry, you will often realize you have been using a function or datatype or method for doing something that is incorrect and which you should replace with another variant that already exists.

Yes, there is something fundamentally wrong with this status quo.

It’s not limited to Text and ByteString and String - can we make an authoritative list that we could base a proposal on?

4 Likes

It seems the root of one problem is the fact that base has a personality complex and does too many things with conflicting requirements.

In hindsight the framing I used in the start of this thread could have been better. By “alternative prelude” I really mean “standard library that is not tied to GHC and provides a few more batteries”.

One way to accomplish this would be to:

  1. rename the current base library to ghc-base
  2. provide a new library called base, versioned and released independently of GHC, which re-exports most of what base currently exports minus the truly “internal” modules of the GHC.* namespace. Each major release of base would be frozen modulo changes necessary to preserve compatibility with new GHC releases. Compatibility would be guaranteed with a reasonably generous window of GHC releases (three or five releases, perhaps?)
  3. perhaps eventually further split up ghc-base following the model of @nomeata’s split-base proposal

Of course, there is a somewhat major hole in this plan: typeclasses. The basic typeclasses that we all rely on (e.g. Functor, Enum, Generic) would all need to live in ghc-base (in part because they have some special support in GHC and in part because otherwise you end up with instance-hell). This poses a challenge for proposals that want to change typeclass methods (e.g. removing a typeclass method, à la Monad of No return) or the typeclass hierarchy (e.g. adding a superclass à la Applicative-Monad Proposal) since these changes also require source-incompatible changes in downstream users.
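
To show why a change like removing a class method is source-incompatible for downstream users, here is a sketch of an instance written so that it would survive such a removal. The type Box is hypothetical. An instance that spelled out `return = Box` would stop compiling once return ceased to be a method of Monad.

```haskell
newtype Box a = Box a
  deriving (Eq, Show)

instance Functor Box where
  fmap f (Box x) = Box (f x)

instance Applicative Box where
  pure = Box
  Box f <*> Box x = Box (f x)

instance Monad Box where
  -- No definition of 'return': it defaults to 'pure', so this instance
  -- would keep compiling if 'return' stopped being a class method.
  Box x >>= f = f x
```

Code in this style is unaffected, but every instance in the ecosystem written the older way would need an edit, and that is the breakage no re-export layer on top of ghc-base can hide.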

This is a major problem that was discussed numerous times in the Foldable-Traversable, Applicative-Monad, and Semigroup-Monoid proposals. In short, there are a few ideas that may help mitigate much of the damage, but I don’t recall any approach that eliminates this source of breakage entirely.

Despite the typeclass problem, I suspect that this idea of versioning base independently of GHC would still pull its weight.

8 Likes

There are ≃ 30 years of teaching materials, tutorials, papers, wiki articles, etc., all widely disseminated on the internet, that use String and other out-of-fashion types. This mass of written texts will not go away regardless of our intentions Re: base.

It seems to me that the worthy goal of pointing beginners to the «generally correct» option has now become a Herculean task; the choice being to break with the status quo (and render the above works unusable) or keep including (and ultimately teaching) legacy constructs.

In the end most languages picked the latter and when the former was chosen (Python 3), results were not perfect. Not an easy decision.

There are ≃ 30 years of teaching materials, tutorials, papers, wiki articles, etc., all widely disseminated on the internet, that use String and other out-of-fashion types. This mass of written texts will not go away regardless of our intentions Re: base.

It seems to me that the worthy goal of pointing beginners to the «generally correct» option has now become a Herculean task; the choice being to break with the status quo (and render the above works unusable) or keep including (and ultimately teaching) legacy constructs.

Yea, it is what it is, but that doesn’t mean we shouldn’t fix our base. Let the out-of-date tutorials get updated. Let new materials be written. Let the ecosystem change. That’s why we have deprecation processes.

Also, this is why we need an authoritative haskell developers guide hosted on haskell.org, backed by the community. Just like rust, python, etc.

8 Likes

Thank you for clarifying @bgamari, and thank you for helping to lead this effort.

I support the general idea, and I would like to recommend making the critical changes that are obvious (in need and solution). Re-exports and such can wait while we clean up the quality and correctness of what’s available in that standard base.

Despite the typeclass problem, I suspect that this idea of versioning base independently of GHC would still pull its weight.

What are our next steps for this? How do we make this happen?

2 Likes

I would really like to get more eyes on this idea first. In particular, I know that there are some members of the community who have thought deeply about compatibility and core library evolution (@ekmett and current members of the CLC come to mind) who have not commented. I would very much like to hear their thoughts on this as I am certain there are dimensions to this problem that I have not yet considered.

Beyond that, I suspect the next step is a GHC Proposal.

@ketzacoatl

Also, this is why we need an authoritative haskell developers guide hosted on haskell.org, backed by the community. Just like rust, python, etc.

Yes, and this is something that is being discussed as a potential item of the technical agenda of the Haskell Foundation.

2 Likes

I’d like much clearer technical motivation and goals before even attempting an evaluation of existing solutions.

I personally haven’t seen a single alternative prelude that focuses on correctness, apart from the “let’s avoid partial functions” sentiment, which I consider a not too interesting goal (this can be fixed by documentation, module and function names, done). Things like the abstract filepath proposal [0] haven’t been implemented in most of them [1][2] (foundation tried [3]). File handling is still largely “let’s see what happens” with underdocumented exception behavior and tricky cross-platform quirks. Lazy IO is still all over the place. Encoding awareness is often pretty low.
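
As one small illustration of what encoding awareness and avoiding lazy IO can mean in practice with today's libraries, the sketch below reads a file strictly as bytes and decodes it explicitly, so invalid input shows up as a value. The function names decodeUtf8Strict and readUtf8File are invented for the example, and missing-file errors from the read itself are left unhandled.

```haskell
import qualified Data.ByteString as BS
import Data.Text (Text)
import qualified Data.Text.Encoding as TE

-- | Decode bytes as UTF-8, reporting invalid input as a Left value
-- instead of an exception raised later from lazily evaluated code.
decodeUtf8Strict :: BS.ByteString -> Either String Text
decodeUtf8Strict bytes = case TE.decodeUtf8' bytes of
  Left err   -> Left (show err)
  Right text -> Right text

-- | Read a whole file strictly, then decode it explicitly.
-- The file is fully read before this returns, so no handle stays open.
readUtf8File :: FilePath -> IO (Either String Text)
readUtf8File path = decodeUtf8Strict <$> BS.readFile path
```

This makes the encoding assumption explicit at the call site, in contrast to Prelude's readFile, which decodes according to the current locale and reads lazily.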

So, what are the goals? Beginner friendliness, avoiding partial functions, convenient re-exports, being minimal or batteries-included, correctness?

Whatever the goals are, I think we should be able to agree on one: prelude should set a definite standard for what good documentation looks like. This [4] is not it.


Note: discourse is broken and only allows new users to post two links

[0] https://gitlab.haskell.org/ghc/ghc/-/wikis/proposal/abstract-file-path
[1] https://hackage.haskell.org/package/rio-0.1.20.0/docs/RIO-FilePath.html#t:FilePath
[2] https://hackage.haskell.org/package/relude-0.7.0.0/docs/Relude-Base.html#t:FilePath
[3] https://hackage.haskell.org/package/foundation-0.0.25/docs/Foundation-VFS-FilePath.html#t:FilePath
[4] https://hackage.haskell.org/package/base-4.14.1.0/docs/Prelude.html#v:readFile

2 Likes

I guess the underlying motivation for a major break is that otherwise it is hard to see how a new UTF8 lazy text becomes the real String, so that filenames are UTF8 strings, show returns UTF8 strings, …, lazy I/O is deëmphasised and [] is a lazy iterator, that isn’t nearly as often abused as a catch-all container.

Yes, shades of Python 3, but Haskell is not an interpreted language and module dependencies are versioned, whether compiled statically or loaded as shared libraries (the shared objects have hashed names). So a major compatibility break would not introduce nearly as much deadlock, but it would take O(decade) for the install base to switch to the new way.

Perhaps it is too late to start, but it’ll never get easier…

2 Likes