I’ve seen that some people are in favor of including OverloadedStrings in GHC2024.
OverloadedStrings mainly lets you use the double-quote syntax ("like this") for Text (from Data.Text or Data.Text.Lazy), but as the name suggests, it only overloads the syntax. It doesn’t replace it.
This means that anywhere "I am a text" can be a Text or a plain old String (i.e. [Char]), and sometimes (often?) a double-quoted literal is ambiguous and you need to force the type using a type signature or a type application.
Including OverloadedStrings in GHC2024 will break old code by making lots of double-quoted strings ambiguous, the easy workaround being not to use GHC2024 at all (which defeats the purpose).
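To make the ambiguity concrete, here is a minimal sketch (the names viaSignature, viaText and greeting are made up for illustration). It shows one literal that GHC rejects once OverloadedStrings is on, and the usual ways of pinning the type:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as T

-- With OverloadedStrings a literal has type `IsString a => a`.
-- `length` is Foldable-polymorphic, so nothing fixes the literal's type
-- and GHC rejects the following line as ambiguous:
--
--   ambiguous = length "hello"

-- Fix 1: annotate the literal.
viaSignature :: Int
viaSignature = length ("hello" :: String)

-- Fix 2: use a function that only accepts Text.
viaText :: Int
viaText = T.length "hello"

-- A top-level signature is enough when the literal is the whole binding.
greeting :: Text
greeting = "hello"

main :: IO ()
main = print (viaSignature, viaText, greeting)
```

Without the extension, the commented-out line compiles fine, which is the kind of old code that would break.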
Wouldn’t it be better if, instead of overloading the string syntax, one could override it and say that, going forward, "I am a text" means Text unless specified otherwise?
I can see a few options: being able to set a default type for strings, so that with (for example) OverloadedStringDefault Text
"text" is of type Text
but "text" :: String is of type String
or introducing a new syntax for overloaded and non-overloaded strings.
What do you think? Is it worth writing a proposal, or am I the only one interested in this?
It looks like enabling the (obscure*) ExtendedDefaultRules extension might solve the problem.
* I said obscure because this extension is not listed in the manual's list of GHC extensions but under the GHCi extensions.
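A minimal sketch of that idea, assuming the defaulting rules behave as I read them in the user guide (the name rendered is made up for illustration). The default declaration only applies to the module it appears in:

```haskell
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE ExtendedDefaultRules #-}

import Data.Text (Text)

-- Module-local default declaration: ambiguous string literals in THIS
-- module are resolved to Text. It does not affect other modules.
default (Text)

-- `show` only needs `Show a` and the literal only needs `IsString a`,
-- so the literal's type is ambiguous until defaulting picks one.
rendered :: String
rendered = show "hello"

main :: IO ()
main = putStrLn rendered
```

Compiling with -Wtype-defaults should report which type was picked for the literal.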
Another update
ExtendedDefaultRules solves the problem in a way but creates another one. ExtendedDefaultRules can be set per project (like OverloadedStrings) but (unless I am wrong) default declarations are per file.
The combination of OverloadedStrings and ExtendedDefaultRules per project might result in String being chosen without any warning (even though Text would have been preferred for efficiency).
A typical example is parsing a CSV file or some JSON with code like
t <- get "high/low"
case t of
  "low" -> Low
  "high" -> High
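One way to avoid depending on defaulting in code like this is to pin the scrutinee's type with a signature. The sketch below is only an illustration: Level and parseLevel are hypothetical names, and the polymorphic get from the snippet is replaced by a plain Text argument.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)

-- Hypothetical result type for the "high/low" field.
data Level = Low | High
  deriving (Show, Eq)

-- The signature pins the scrutinee to Text, so the literal patterns
-- below are matched as Text and no defaulting is involved.
parseLevel :: Text -> Maybe Level
parseLevel t = case t of
  "low"  -> Just Low
  "high" -> Just High
  _      -> Nothing

main :: IO ()
main = mapM_ (print . parseLevel) ["low", "high", "medium"]
```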
Imo the general idea of “you defined this literal here, but I can’t infer what its type is” is completely fine as is. Works the same as f = pure () giving you f :: Applicative f => f () as a type instead of IO ().
The more general way to tackle this issue would be user-definable type defaults. Sure, you wouldn’t get the magical out-of-the-box-icity, but default IsString Data.Text.Text is short, obvious in what it does and can work exactly like a typeclass instance.
Also there’s a much bigger issue with literals, it’s the fact that they’re clunky. Conversions through runtime functions both mean you can’t throw compilation-time errors (see ByteString) and you can’t optimize anything (fromString ("naïve" :: Text) should just be packing "na\xC3\xAFve"# into a Text). I suppose quasiquoters are the answer, but writing [bs|thing|] would be a lot of overhead for something that seemingly should be supported out of the box.
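The ByteString case can be shown with a small sketch (lambdaBytes is a made-up name). The literal contains a character that does not fit in one byte, yet it compiles without complaint because the conversion happens in a runtime function:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import Data.Word (Word8)

-- '\955' is the Greek letter lambda (U+03BB), which does not fit in a byte.
-- The IsString instance for ByteString keeps only the low 8 bits of each
-- character, so the literal is accepted and quietly becomes the byte 0xBB.
lambdaBytes :: [Word8]
lambdaBytes = B.unpack ("\955" :: ByteString)

main :: IO ()
main = print lambdaBytes  -- prints [187]
```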
I, too, would like "foo" :: Text. Possibly even un-overloaded. My impression is that to do that properly, to have a nice consistent language overall in the end, we should also Text-ify base and Prelude. It is already hard to explain why we have String and Text and Lazy.Text (plus all the binary variants); I fear it becomes more of a mess as soon as "foo" :: Text, but many basic functions still want String.
But such a change to base/Prelude is a huge undertaking, so I’m not very optimistic.
I don’t want Text by default. In fact I like Strings as they are (using Text because String is slow is actually a premature optimization), and the last thing that I want is a Textified Prelude.
The main issue with a Textified Prelude is that most functions from Text collide (in name) with the list ones (e.g. length, drop, take), etc.
The answer to that is the Foldable/Traversable Proposal (FTP), which I think is a mistake (mainly because it introduces type classes to resolve name overloading). Moreover, FTP is not enough for Text, as Text is only a MonoFoldable: its element type is fixed to Char.
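To illustrate why Foldable cannot cover Text, here is a cut-down sketch of a MonoFoldable-style class. MonoContainer, Elem, monoToList and monoLength are hypothetical names written for this example, not the API of any real package:

```haskell
{-# LANGUAGE TypeFamilies #-}

import qualified Data.Text as T

-- Hypothetical, cut-down stand-in for a MonoFoldable-style class.
-- Foldable needs a container of kind `Type -> Type`; Text has kind `Type`
-- with a fixed element type, so the element is exposed via a type family.
class MonoContainer c where
  type Elem c
  monoToList :: c -> [Elem c]
  monoLength :: c -> Int
  monoLength = length . monoToList

instance MonoContainer T.Text where
  type Elem T.Text = Char
  monoToList = T.unpack
  monoLength = T.length

instance MonoContainer [a] where
  type Elem [a] = a
  monoToList = id

main :: IO ()
main = do
  print (monoLength (T.pack "hello"))
  print (monoLength "hello")
```

The point is that a second class hierarchy is needed next to Foldable before length can work on both String and Text.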
What I want is to be able to use String when I can and, if I decide I need Text, to use OverloadedStrings (in the file that requires it) without having to add type signatures everywhere.
Ideally I would have preferred String to be [Char] with a Text optimization under the hood, but I understand that is not possible.
Not sure what you’re referring to. ExtendedDefaultRules has existed since GHC 6. The ghc-proposals link is just an issue, so it hasn’t even been proposed officially yet.
I grant that String v. Text and strict v. lazy decisions are annoying, particularly for newcomers, but I don’t understand why people say this about the “binary variants”. Many languages correctly distinguish strings from arbitrary binary data.
Taking inspiration from QualifiedDo, which allows you to overload the meaning of do-notation on a case-by-case basis, perhaps we could have a QualifiedStrings extension which would allow you to write:
import Data.Text qualified as T
foo :: T.Text
foo = T."sometext"
-- perhaps without dot? T"sometext"
The idea is that if SomeModule has a fromQualifiedString function defined, writing SomeModule."sometext" would desugar to fromQualifiedString "sometext".
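Since the extension does not exist, the desugared form can only be sketched in today's Haskell. In the sketch below fromQualifiedString is defined locally as a stand-in, on the assumption that it would simply pack the String; Data.Text exports no such function:

```haskell
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical hook: under the proposed extension, a module would export a
-- function with this name, and `T."sometext"` would desugar to a call to it.
-- It is defined locally here because Data.Text exports no such function.
fromQualifiedString :: String -> Text
fromQualifiedString = T.pack

-- What `foo = T."sometext"` would mean after desugaring.
foo :: Text
foo = fromQualifiedString "sometext"

main :: IO ()
main = print foo
```

Because the target type is named at the literal, no ambiguity or defaulting is involved.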
Sorry, that’s what I meant with the parentheses: those two are separate for a good reason, but there are again two variants (plus more with myriads of vectors and arrays). I agree that having separate types for text and bytes is the right thing to do.
My bad, it should have been “Using Text because String is slow is a premature optimisation”.
What I mean is that, as an abstraction, seeing a string/text as a list of chars is not only fine but really nice.
Having String = [Char] allows you to do lots of things out of the box: iterating, transforming, chopping, filtering, etc.
The Text abstraction doesn’t bring anything but disadvantages: name clashes, ambiguous types, back and forth conversion etc …
Strings have the reputation of being slow and inefficient. It is true in theory, but most of the time String is good enough and therefore better than Text. Switching to Text for performance reasons is an optimisation.
Doing so before actually encountering performance issue is “premature optimisation”.
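As a small illustration of the "out of the box" point and the name clashes, here is the same pipeline written once for String and once for Text (shoutString and shoutText are made-up names):

```haskell
import Data.Char (isAlpha, toUpper)
import Data.Text (Text)
import qualified Data.Text as T

-- On String, the ordinary list functions apply directly.
shoutString :: String -> String
shoutString = map toUpper . filter isAlpha

-- On Text, the same pipeline needs the qualified Data.Text versions,
-- because `map` and `filter` clash with the Prelude names.
shoutText :: Text -> Text
shoutText = T.map toUpper . T.filter isAlpha

main :: IO ()
main = do
  putStrLn (shoutString "hello, world!")
  putStrLn (T.unpack (shoutText (T.pack "hello, world!")))
```

Both lines print HELLOWORLD; the difference is only in the imports and the conversions at the edges.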