I believe we’re talking about two different things. You seem to have a more project-oriented view of how this could be resolved (which is a vision I agree with), but `base` cannot depend on `text` or `vector`, yet its APIs are the foundation for learning Haskell. Ultimately, if we are going to promote better ways of writing Haskell, `base` has to set the example.
I suppose, from that point of view, what I’m thinking is that my proposed `stdlib` could gradually come to supplant `base` as the ‘foundation for learning Haskell’. Note that GHCi (and maybe GHC too?) already makes a bunch of packages available; this would essentially just formalise that situation and extend it to Cabal.
But yes, I agree that the ideal situation would be for these APIs to all get subsumed into `base`. As I said above, I’m hopeful that the plans to split `base` might allow this to happen eventually. (I’m not sure what the current status of that is, though.)
This is a point I’m stuck on at the moment. Why does `base` have to show that example?
It’s the first real Haskell code people ever see. It’s the first documentation that people ever read. It contains the first programming patterns that people see (like using a `go` sub-function). It throws the first errors people have to decipher. It is that fundamental. If we despise `base` or let it rot, there’s no hope, because it is our foundation, and trying to bypass its problems without fixing them is like treating symptoms instead of causes: not medically advisable for any kind of real effect.
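For readers who haven’t met the `go` pattern mentioned above, here is a minimal sketch of it: a local worker function that carries an accumulator through the recursion. The function `sumDigits` is a made-up illustration, not something taken from `base`.

```haskell
-- Hypothetical example of the "go" idiom; `sumDigits` is not part of base.

-- | Sum the decimal digits of an integer (the sign is ignored).
sumDigits :: Integer -> Integer
sumDigits = go 0 . abs
  where
    -- `go` is the local worker: `acc` holds the running total,
    -- and the second argument is what is left to process.
    go acc 0 = acc
    go acc n =
      let acc' = acc + n `mod` 10
      in acc' `seq` go acc' (n `div` 10)  -- force the total at each step
```

For example, `sumDigits 1234` evaluates to `10`. Keeping the worker local means the accumulator never leaks into the public signature.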
Another point I’d like to bring to the table: OCaml has the problem of warring standard libraries, and the feedback I have received from teachers and industrial users is always the same: it is a painful and sad situation. I hope we can learn from what happens outside of our community.
…well, that’s the end of the abstract monadic `IO` type - it has an implementation but no denotation.
But I digress - from this older thread:
- this post by me;
There are wishes, and then there are flights of fancy…
Aha, thanks! So now we’re getting closer to uncovering hidden assumptions and achieving a separation of the “good things” we want to see from the means of obtaining them:
Good things:
- allow easy access to `text`, `vector` etc. because these are better ways to write typical programs than `String`, `[a]` etc., and we want people to have an easy time with Haskell
- onboard people to Haskell in a way that quickly familiarises them with the way they typically should be writing Haskell in practice
Ways of achieving the good things:
1. Improve `base`, particularly to absorb `text` and `vector`, because pretty much all existing codebases and pedagogical material use `base`! Therefore these good things will be available to pretty much every Haskeller. Or,
2. Perhaps something else.
I strongly agree that 1 looks like the most effective course of action. Still, I would encourage everyone to try to think of examples of 2 (if only to document them for posterity as inferior to 1) and to continue to separate “good things” we want from the way of achieving the good things.
The hidden assumptions push us towards believing that 1 is the only valid solution. The hidden assumptions may be correct! In fact I think they probably are. But let’s please expose those assumptions so that everyone is discussing based on the same principles.
I don’t consider myself a “maintainer” of `base` when I am acting as a CLC member. That title rolls off my tongue more easily when I have made a proposal and follow through on it. Instead I think of our role as that of stewardship: looking after it, but not necessarily changing it. If `vector` and `text` were indeed absorbed into `base`, I would not ipso facto feel like my burden had increased; rather, it would be reviewing the work of volunteers (which may include myself, in another hat!) during and after the merge that might require additional attention on my part. I doubt that a software library follows the “square-cube law”, in that a broader code coverage results in significantly larger proposal volume.
Completely agreed with all of this!
In response to your first point, let me list the libraries I’d personally like to see merged into `base`, roughly in order from most to least important:
- `text` (because `String` is terrible)
- `bytestring` (because IO is liable to mess up encodings otherwise)
- `containers` (because it’s just so generally useful)
- `random` (ditto)
- `array`, `vector` or both (ditto, ditto)
- `transformers` and `mtl` (because they make up probably the most popular way to structure Haskell programs)
- and possibly also `deepseq`, `directory`, `filepath`, `mtl`, `parsec`, `process`, `stm`, `template-haskell`, `transformers`, depending on how far we want to go
I’d cut after `vector`. Transformers and parsec are pushing it too hard, having popular and/or more efficient alternatives.
I don’t know about `containers` and `vector`, but I would really, really like `base` to stop promoting `String = LinkedList Char`.
Okay, having typed that, I now want stuff like `LotsOfStuff = LinkedList (Key, Stuff)` to be shooed away too.
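To make the association-list complaint concrete, here is a rough sketch comparing the list-of-pairs shape with `Data.Map` from `containers`. The `Key` and `Stuff` types and the sample data are invented for illustration, and the sketch assumes `containers` is installed (it ships with GHC).

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins for the Key and Stuff types mentioned above.
type Key = String
type Stuff = Int

-- The shape that base nudges people towards: a linked list of pairs.
stuffList :: [(Key, Stuff)]
stuffList = [("apples", 3), ("pears", 5), ("plums", 8)]

-- Prelude's `lookup` walks the list from the front: O(n) per query.
findInList :: Key -> Maybe Stuff
findInList k = lookup k stuffList

-- The same data as a balanced tree from the containers package.
stuffMap :: Map.Map Key Stuff
stuffMap = Map.fromList stuffList

-- `Map.lookup` is O(log n) per query.
findInMap :: Key -> Maybe Stuff
findInMap k = Map.lookup k stuffMap
```

Both lookups return the same answers; the difference only shows up as the collection grows, which is exactly why the bad default is easy to miss.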
(For staging reasons we cannot have TH in base.)
Fair enough: after all, I use `megaparsec` myself.
Huh, thanks for enlightening me! For a while I’ve been wondering why something seemingly so fundamental got split off into a separate package.
I monitor the hackage-recent feed and generally like to hackage-dive once in a while. And I see a lot of packages, ancient and recent alike, using strings and lists in inappropriate places. It’s a pity that otherwise okay (algorithmically etc.) packages stick to bad defaults.
Apparently, this is a significant hurdle
Yes, it seems it is. I’d like to understand more about the nature of that difficulty.
and unfortunately, changing `base` retrospectively won’t change those packages; it will only make new packages more likely to start with the good defaults (and, I suppose, the maintainers of the old packages are somewhat more likely to switch when they get round to it).
I wonder if it’s warranted to initiate a community effort to promote switching `String` to `Text` in the most important places in the ecosystem.
You could try sending patches to the maintainers of said packages as polite messages:
- they’ll either accept them “as is” (or after some editing);
- alternatively, you’ll receive a polite reply as to why your patch wasn’t used (or wouldn’t work).
In this way:
- the easier usages of `String` (i.e. `[Char]`) are deprecated;
- and there will (presumably) be valid reasons for its ongoing usage, which then provide feedback for the maintainers of `Text`.
Such an effort would also help to show the wider Haskell community that `Text` can now be relied on as a working replacement for the old `String` type.
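As a rough idea of what such a patch might look like, here is a sketch that adds a `Text` entry point and keeps the old `String` one as a deprecated wrapper. The names `shout` and `shoutString` are invented for illustration and do not come from any real package; the sketch assumes the `text` package is available.

```haskell
import qualified Data.Text as T

-- New Text-based entry point (hypothetical function).
shout :: T.Text -> T.Text
shout = T.toUpper

-- GHC warns importing modules that use the deprecated name.
{-# DEPRECATED shoutString "Use shout, which works on Text" #-}

-- Old String-based entry point, kept as a thin wrapper so that
-- existing callers keep compiling while they migrate.
shoutString :: String -> String
shoutString = T.unpack . shout . T.pack
```

Downstream users then get a compile-time warning instead of a breaking change, and any objections they raise become the feedback mentioned above.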
I think this is absolutely a great idea. @atravers’s suggestions would be a good starting point for this.
(For myself: of my maintained Haskell projects, as far as I can recall, all use `Text` except for one. That last one in fact uses a type which effectively boils down to `[[Char]]`, though now that I think of it there’s a chance `[Text]` might be better. The inner `[Char]` there is usually only one character long, so I have no idea if `Text` would be better or worse for that use-case; in any case the difference seems small enough that switching it over never felt quite worth the effort, though I might still do it some day.)
I think this has been apparent for quite some time to anyone who knows about `Text`…
Hrm:
…according to the current documentation, `Text` is specifically for Unicode text (with `ByteString` just being a compact block o’ bytes).
As I understand it(!), `ByteString` predates `Text`. Out of annoyance with `[Char]`, many maintainers switched to `ByteString` (presumably @tomjaguarpaw is one such maintainer, hence that suggestion). But to maintainers of packages still relying on `[Char]`, the fact that the suggestions here weren’t about the same type isn’t exactly reassurance that switching to either `ByteString` or `Text` is a good long-term investment of coding time.
Can everyone here agree on which type is more suitable for replacing the remaining uses of `[Char]`? It’ll make those patches easier to prepare if people aren’t continually switching between those two types…
Yeah, I assumed that was a mere typo. At least to me, `Text` seems like the obvious replacement for `[Char]`, since both are Unicode-aware: if anything, `ByteString` would surely be a replacement for `[Byte]`!
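A small sketch of that distinction, assuming the `text` and `bytestring` packages are available: `Text` counts characters, `ByteString` counts bytes, and an explicit encoding step sits between the two. The sample string and the top-level names are made up for illustration.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- "héllo", with the accented letter written as an escape (U+00E9).
greeting :: T.Text
greeting = T.pack "h\233llo"

-- Turning Text into bytes requires choosing an encoding explicitly.
encoded :: B.ByteString
encoded = TE.encodeUtf8 greeting

-- Text measures length in characters: 5.
charCount :: Int
charCount = T.length greeting

-- ByteString measures length in bytes: 6, because U+00E9 takes
-- two bytes in UTF-8.
byteCount :: Int
byteCount = B.length encoded

-- Decoding can fail on arbitrary bytes, so decodeUtf8' returns an Either.
roundTrip :: Either String T.Text
roundTrip = either (Left . show) Right (TE.decodeUtf8' encoded)
```

Here `roundTrip` is `Right greeting`, while bytes that are not valid UTF-8 would yield a `Left`. That failure case is the practical reason the two types are not interchangeable as `[Char]` replacements.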