Thank you very much for all of your hard work, Bodigrim and everyone involved.
Congratulations on the release!
Wow!! Good stuff. Just tested it locally on one of my projects and it works.
Thank you for your efforts to see this thru!
I don’t have the full picture, so forgive my ignorance on some of this… I am curious, how far down the road does this push us WRT simplifying our String/Text madness in Haskell?
I would not say that String is a weakness; it is actually a strength. There is no way not to have Char in base and there is no way not to have lists, so [Char] will always be available. It totally makes sense for base not to introduce more entities than needed and just use [Char] to represent strings. It’s also a wonderful educational tool: while there are certain pitfalls with readFile :: String -> IO String, a novice can start writing something useful knowing only about lists and basic polymorphic operations on them. This is as simple as possible.
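To make the "only lists and basic polymorphic operations" point concrete, here is a minimal sketch that uses nothing beyond base. The name shout and the sample greeting are made up for illustration.

```haskell
import Data.Char (toUpper)

-- String is only a type synonym for [Char], so ordinary list
-- functions work on it without any extra imports or conversions.
shout :: String -> String
shout = map toUpper

main :: IO ()
main = do
  let greeting = "hello, world" :: [Char]
  putStrLn (shout greeting)        -- map over the characters
  putStrLn (reverse greeting)      -- generic list reversal
  print (length (words greeting))  -- split on whitespace, then count
```

Every function here except toUpper comes from the Prelude, which is roughly what makes [Char] approachable for a first program.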
It is a weakness in the sense of making “bad” things easier than “good”.
You start with [Char], then you add some printfs, read a file or two, show some things… and then you find yourself deeply committed to your training wheels.
Yes, I know about some use cases where [Char] is okay. But you need to be on the lookout to prevent it from leaking into all the other places.
I’m even fine with the [Char] type by itself. The madness lies in the String.
Yea, I would agree with @wiz on this.
In practice, String and [Char] are warts that make implementations more complex, more verbose, and less straightforward for practical use of Haskell in production. They also complicate teaching/learning Haskell to/for beginners. To say otherwise is not wrong, but obviously not the conclusion you would arrive at as a new Haskeller uninhibited by experiential baggage from Haskell of yesteryear.
The madness lies in the String.
I’d argue the madness lies in the prolific use of String in the ecosystem, starting with base and extending to the rest of our “standard library”.
My point is that we cannot remove String from base, because Char and [] would always remain there.
The ecosystem should indeed use String less. The biggest hurdle here is the file API, which is now targeted by the AFP proposal, led by @hasufell.
I didn’t immediately know what you meant, so, for others, it is the abstract file path proposal.
Hey so what’s the upgrade path for this for library authors? Should I go around and make text-2.0 the lower bound for all my libs?
Why make it a lower bound? text is a boot library, so GHC < 9.4 will stick to text-1.2.
The majority of packages do not need any changes to upgrade, just relax an upper bound to text < 2.1. A template cabal.project can be found here: Template cabal.project with text-2.0 support · GitHub
And I see aeson doing this:
, text >=1.2.3.0 && <1.3 || >=2.0 && <2.1
So I’ll follow that pattern. Thank you!
I don’t understand why we can’t (conceptually) remove type String = [Char], change all the relevant functions such as putStrLn to use Text instead, and make string literals have the type Text?
This would pretty much solve the string situation afaict (but of course the cost of breakage would be very high).
Am I missing something here?
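For comparison, this is roughly what one can already do today without touching base: opt in to Text literals per module and use the Text versions of the IO functions. It is only a sketch of the status quo, not of the proposed change; the name greeting is made up, and it assumes the text package is available (it ships with GHC as a boot library).

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- With OverloadedStrings the literal is typed by its signature,
-- so this is a Text and not a String.
greeting :: Text
greeting = "hello, text"

main :: IO ()
main = do
  -- Text has its own putStrLn in Data.Text.IO.
  TIO.putStrLn (T.toUpper greeting)
  -- Prelude functions still speak String, so crossing the
  -- boundary needs explicit pack/unpack conversions.
  putStrLn (T.unpack greeting)
  TIO.putStrLn (T.pack (show (T.length greeting)))
```

The pragma, the qualified imports, and the pack/unpack calls are the extra steps that would go away if Text were the default string type.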
I think the problem is deeper than just choosing a type. Given that the correct way to interact with a terminal depends on the mode that terminal is in, I don’t think there is a unique choice that works properly. For example:
% cat /tmp/test.hs && ghc /tmp/test.hs && LC_ALL=C /tmp/test
main = putStrLn "💔"
Loaded package environment from /home/tom/.ghc/x86_64-linux-8.10.7/environments/default
test: <stdout>: commitBuffer: invalid argument (invalid character)
I think the system needs rethinking from the ground up.
I think one easy solution is to just always default to utf8, or does that not solve this problem?
That would require that anyone using a Haskell system must have configured their terminal for UTF8 which seems rather a strong requirement.
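Short of changing the default for everyone, a single program can already opt in to UTF-8 output. A minimal sketch using hSetEncoding and utf8 from System.IO:

```haskell
import System.IO (hSetEncoding, stderr, stdout, utf8)

main :: IO ()
main = do
  -- Override whatever encoding the locale selected for these handles.
  hSetEncoding stdout utf8
  hSetEncoding stderr utf8
  -- This no longer fails under LC_ALL=C; whether it is displayed
  -- correctly still depends on what the terminal expects.
  putStrLn "💔"
```

GHC.IO.Encoding.setLocaleEncoding utf8 is the process-wide variant of the same idea. Either way the trade-off described above remains: on a terminal that is not set up for UTF-8 the bytes are written but may be rendered as garbage.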
It seems the hardest part of this fix is figuring out the best course of action.
Maybe we should advocate for the HF Technical WG to take on this next step?
Given how GHC breaks our world from time to time, and given how “graceful migrations” is a mostly solved problem, I don’t see why we couldn’t figure this out if we try and push hard on it.
Think of how nice it will be to not need to go through all the extra hoops when using Text!
A number of languages enforce UTF8 these days, so I don’t think it’s cruel to expect it.
Do they even force UTF8 for interacting with the terminal? I’m personally in favour of requiring UTF8 everywhere but I’m not sure if there are other encodings in popular use on interactive terminals.
Yeah, lots of environments have terminals and locales misconfigured. And then there’s a C/C.UTF8 fustercluck.
Personally I’d rather see mojibake once in a while than have some script/build crash for printing pretty Unicode lines.
And a lot of Windows systems have the console configured “right”, using a local encoding.
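For the "degraded output rather than a crash" preference, one option that keeps the locale's encoding is to switch the handle to its transliterating variant. This is a sketch only: makeLenient is a made-up name, and it relies on the //TRANSLIT suffix that mkTextEncoding accepts, whose exact result depends on the platform's encoding support.

```haskell
import System.IO
  ( Handle, hGetEncoding, hSetEncoding, mkTextEncoding, stdout )

-- Re-open the handle's current encoding in transliterating mode, so
-- characters the encoding cannot represent are substituted instead of
-- raising "invalid argument (invalid character)".
makeLenient :: Handle -> IO ()
makeLenient h = do
  current <- hGetEncoding h
  case fmap show current of
    -- Only add the suffix when the encoding name does not carry one yet.
    Just name | '/' `notElem` name ->
      mkTextEncoding (name ++ "//TRANSLIT") >>= hSetEncoding h
    -- Binary handles (Nothing) and already-suffixed encodings stay as-is.
    _ -> pure ()

main :: IO ()
main = do
  makeLenient stdout
  putStrLn "💔"
```

Under LC_ALL=C this should print some substitute for the heart rather than abort, while a terminal with a working UTF-8 locale still gets the real character.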