Text variant for getEnv

Hello,

getEnv returns a String, so as someone who bases my internals on Text, I generally just pack/unpack and carry on. This time, though, I wondered whether there was an existing Text variant, and that question quickly morphed into: why isn’t there a getEnv variant that returns Text?

I attempted to answer my own question, but only came up with partial answers that were about other aspects of the String/Text problem.

I also briefly thought about adding my own and sending in a PR, but couldn’t determine whether I’d be wasting my time raising the discussion with the core libraries committee.

Feedback is appreciated, thank you!

The getEnv function comes from System.Environment. Since that is in the base package I don’t think it can depend on the text package.
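Since base cannot provide it, a Text-returning wrapper has to live in user code or in a separate package. A minimal sketch of the pack-and-carry-on approach, assuming the text package is available; the names getEnvText and lookupEnvText are hypothetical and not part of base or text:

```haskell
import qualified Data.Text as T
import System.Environment (getEnv, lookupEnv)

-- Hypothetical helper: behaves like getEnv (throws an IOException if the
-- variable is unset) but returns Text instead of String.
getEnvText :: String -> IO T.Text
getEnvText name = T.pack <$> getEnv name

-- Hypothetical helper: total variant built on lookupEnv, returning
-- Nothing when the variable is unset.
lookupEnvText :: String -> IO (Maybe T.Text)
lookupEnvText name = fmap T.pack <$> lookupEnv name
```

The variable name is left as a String here because that is what System.Environment expects; only the result is converted.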

I’ve enjoyed using Envy as a nicer interface around environment variables. If you just want Text support, env-extra looks promising.

Good point. So I read up on that (“why isn’t text in base in haskell”), and I see how that discussion goes nowhere.

Also, it seems the issue is more about replacing String with a ByteString / Vector variant of some kind (that I’m not familiar with), not Text.

It is unfortunate that the Haskell ecosystem cannot figure out how to allow this change to percolate through.

This leads me to a new question. Some threads I’ve read suggest that all three (or more) types have their place, so if I understand correctly, that suggests String has its place too. When is it actually appropriate to use String in code you care about?

Thank you for the references!

This is a pretty complicated discussion on its own! I’ve written a fair amount of Haskell using both String and Text, but I honestly can’t think of a time where the String type was actually what I wanted! The reasons I used it in practice don’t have anything to do with the type itself:

  1. It’s the default, and I don’t need to depend on a library for it. It’s still hard to depend on libraries outside a full-on Cabal project (i.e. for runhaskell-style scripts). That’s a problem on its own!
  2. A lot of existing libraries and APIs use String and don’t always have clear alternatives.
  3. OverloadedStrings needs to be enabled manually and can sometimes cause ambiguous type errors in code that was fine without it.
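To illustrate the third point, here is a small sketch of the kind of ambiguity OverloadedStrings can introduce. The definitions goodLen and greeting are made up for illustration; the failing line is left commented out so the snippet still compiles:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text as T

-- Without OverloadedStrings the literal below is a String and this compiles.
-- With the extension on, the literal may be any IsString instance and
-- `length` accepts any Foldable, so GHC cannot pick a type and reports
-- an ambiguous type variable:
--
-- badLen :: Int
-- badLen = length "hello"

-- Pinning the literal's type resolves the ambiguity.
goodLen :: Int
goodLen = length ("hello" :: String)

-- When the surrounding context fixes the type, no annotation is needed.
greeting :: T.Text
greeting = "hello"
```

In practice the fix is usually a type annotation or a monomorphic function such as T.length, but existing String-based code may need several of these once the extension is switched on.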

None of these are great reasons! But changing a core type in a language with a lot of existing code is really hard. There’s a delicate balance between having the language stagnate and having too much churn for users and library authors.

The main advantage of String over Text is that String is a simple inductive algebraic data type while Text is opaque. So with Text you always have to use the provided library functions to manipulate it and that can be quite bothersome.
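As a rough sketch of that difference, here is the same small function written against both types. The function names are hypothetical; T.uncons and T.cons are from Data.Text:

```haskell
import Data.Char (toUpper)
import qualified Data.Text as T

-- String is just [Char], so ordinary list pattern matching works.
capitalizeString :: String -> String
capitalizeString []       = []
capitalizeString (c : cs) = toUpper c : cs

-- Text is opaque, so the same logic has to go through library
-- functions (uncons to take it apart, cons to put it back together).
capitalizeText :: T.Text -> T.Text
capitalizeText t = case T.uncons t of
  Nothing      -> t
  Just (c, cs) -> T.cons (toUpper c) cs
```

Neither version is hard to write, but the String one uses nothing beyond the language's built-in pattern matching.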

And Text itself is also not ideal for several reasons:

  • Text uses UTF-16 while most of the rest of the world uses UTF-8, so Text is usually less space efficient and you often have to copy it when interacting with the outside world. There is some movement towards text-utf8, but it is going slowly and it is uncertain whether it will ever replace the current Text implementation. One of the problematic areas is fusion: it might make single operations slower, but it can eliminate intermediate allocations if you, for example, map a function over a Text several times.
  • Text is unpinned, which means that you need to copy it to be able to pass it to safe foreign functions. An example of where this limits performance is the pcre2 library.
  • Text has some overhead, which gave rise to the text-short package; it claims that Text is not suitable for short text (I personally don’t think the overhead is that large). I would argue that Text is not suitable for long strings either. I would recommend storing larger text in a rope, for example the one defined in yi-rope. Ropes provide much faster functions for combining and splitting text: O(log n) for most operations instead of O(n).
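To make the first bullet's point about the outside world concrete: whatever representation the installed text version uses internally, crossing a byte-oriented boundary goes through an explicit conversion that produces a new value. A minimal sketch, assuming the text and bytestring packages; toWire and fromWire are hypothetical names:

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical helper: produce UTF-8 bytes for a file, socket or FFI call.
-- This allocates a fresh ByteString rather than sharing Text's buffer.
toWire :: T.Text -> BS.ByteString
toWire = TE.encodeUtf8

-- Hypothetical helper: decode incoming UTF-8 bytes. decodeUtf8' returns
-- Left on invalid input instead of throwing, which is mapped to Nothing.
fromWire :: BS.ByteString -> Maybe T.Text
fromWire bytes = either (const Nothing) Just (TE.decodeUtf8' bytes)
```

For occasional conversions this cost is negligible; it tends to matter only in tight loops that repeatedly cross the boundary, such as the regex case mentioned above.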