The Quest to Completely Eradicate `String` Awkwardness

darkxero · August 7, 2024, 1:44am

Text is the better choice when you just want to shove readable bytes somewhere (the web, a file), which is what you want 99% of the time. So yeah, you want String to be Text, and you want it to work like Java, which swaps String concatenation for a StringBuilder when it can. Do you really want to walk over the whole previous String every time you append another, when all you want is one big String at the end? This is very good because it helps people fall into the pit of success.
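
To make the concatenation cost concrete, here is a minimal sketch that is not from any library discussed here. It assumes the `text` package, with `Data.Text.Lazy.Builder` playing roughly the StringBuilder role; `slowConcat` and `joinLines` are hypothetical names used only for illustration.

```haskell
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Builder as B

-- Hypothetical helper: left-nested (++) re-walks the accumulated String
-- on every step, so the total work grows quadratically with the output.
slowConcat :: [String] -> String
slowConcat = foldl (++) ""

-- Hypothetical helper: collect the pieces in a Builder and materialise
-- the final Text once, at the end.
joinLines :: [T.Text] -> T.Text
joinLines = TL.toStrict . B.toLazyText . foldMap line
  where
    line t = B.fromText t <> B.singleton '\n'
```

Both functions produce the same kind of result; the difference is only in how often the already-built prefix gets copied.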

Now, the value of String today is educational. It’s a list. List functions, functor functions, monad functions, all apply. You teach students to write reverse for a list themselves, and then teach them how to use it. You make them compose the list tools they have acquired to solve String-related problems.
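
As a rough illustration of that point, the sketch below uses only the Prelude and `Data.Char`. All four function names are hypothetical; the point is just that ordinary list, Functor and Monad tools work on String unchanged.

```haskell
import Data.Char (toUpper)

-- Hypothetical helper: uppercase the first character of one word.
capitalize :: String -> String
capitalize []     = []
capitalize (c:cs) = toUpper c : cs

-- List tools composed: split into words, transform each, join again.
-- Note that words/unwords collapse runs of whitespace.
capitalizeWords :: String -> String
capitalizeWords = unwords . map capitalize . words

-- Functor: fmap over a String is just map over its characters.
shout :: String -> String
shout = fmap toUpper

-- Monad: (>>=) on lists is concatMap, so each Char can become several.
doubleChars :: String -> String
doubleChars s = s >>= \c -> [c, c]
```

In GHCi, `doubleChars "abc"` evaluates to `"aabbcc"`.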

This is the definition in base; students can come up with either variant.

reverse                 :: [a] -> [a]
#if defined(USE_REPORT_PRELUDE)
reverse                 =  foldl (flip (:)) []
#else
reverse l =  rev l []
  where
    rev []     a = a
    rev (x:xs) a = rev xs (x:a)
#endif

A classic example problem and its solution:

isPalindrome :: String -> Bool
isPalindrome ss = ss == reverse ss

Now, if it were:

isPalindrome :: Text -> Bool
isPalindrome ss = ss == T.reverse ss

You would be showing them this:

reverse ::
#if defined(ASSERTS)
  HasCallStack =>
#endif
  Text -> Text
reverse (Text _ _ 0) = empty
reverse t            = reverseNonEmpty t

reverseNonEmpty ::
  Text -> Text
#if defined(PURE_HASKELL)
reverseNonEmpty (Text src off len) = runST $ do
    dest <- A.new len
    _ <- reversePoints src off dest len
    result <- A.unsafeFreeze dest
    pure $ Text result 0 len

reversePoints
    :: A.Array -- ^ Input array
    -> Int -- ^ Input index
    -> A.MArray s -- ^ Output array
    -> Int -- ^ Output index
    -> ST s ()
reversePoints src xx dest yy = go xx yy where
    go !_ y | y <= 0 = pure ()
    go x y =
        let pLen = utf8LengthByLeader (A.unsafeIndex src x)
            -- The next y is also the start of the current point in the output
            yNext = y - pLen
        in do
            A.copyI pLen dest yNext src x
            go (x + pLen) yNext
#else
reverseNonEmpty (Text (A.ByteArray ba) off len) = runST $ do
    marr@(A.MutableByteArray mba) <- A.new len
    unsafeIOToST $ c_reverse mba ba (fromIntegral off) (fromIntegral len)
    brr <- A.unsafeFreeze marr
    return $ Text brr 0 len
#endif

Monads, the ST monad, unsafe operations, Arrays and ByteArrays.
Hardly beginner material and not an inviting first look at Haskell.

Of course you can… not show them and say “just trust me bro”. Or have them take the Text API for granted and work with that, much like how IO is introduced: students just use it, and the fact that it is a monad is glossed over at first.
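
Taking the API for granted could look something like the sketch below, which assumes the `text` package and never looks inside `T.reverse`. `isLoosePalindrome` is a hypothetical extension added only to show the API functions composing the way the list ones do.

```haskell
import qualified Data.Text as T
import Data.Char (isAlphaNum)

-- Same shape as the String version, using only the public Text API.
isPalindrome :: T.Text -> Bool
isPalindrome t = t == T.reverse t

-- Hypothetical variant: ignore case, spaces and punctuation.
isLoosePalindrome :: T.Text -> Bool
isLoosePalindrome = isPalindrome . T.toLower . T.filter isAlphaNum
```

From the student's side this reads almost the same as the list version; what is lost is the chance to open up `reverse` and recognise it.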
