GHC String Interpolation - Prototypes

I announced a survey over in GHC String Interpolation Survey Open!, but I got a lot of good feedback around people wanting to have a tool in hand to play around with and get a feel for the actual (not theoretical) ergonomics of the different options.

So I implemented a prototype of 5 different options here, and it’s extensible, so feel free to clone it, play around with your own proposals, and open PRs for any interesting options.

Spoiler: I kind of like the extensible-hasclass option.

18 Likes

What is the time frame ? I’m quiet busy at the moment but I definitely give it a go in the coming months.

No particular timeframe. Shall we say about a month? That should give ample time to settle

1 Like

implicit-builder

Design with MPTC Interpolate can produce rather weird results when one writes instances polymorphic in string type. And given number of possible stings one would want to write such instances. For example

data Foo a = Foo { foo :: String, bar :: a }

instance (Interpolate String s, Interpolate a s) => Interpolate (Foo a) s where
  interpolate Foo{..} = interpolate ("Foo<"::String)
                     <> interpolate foo
                     <> interpolate ("  "::String)
                     <> interpolate bar
                     <> interpolate (">"::String)

following code will bind 5 parameters when used with SQL queries. It’s probably not something user wanted. Is it good abstraction?

HasClass

I like this design. But Interpolate is tied to string. It’s suboptimal performance-wise. We could instead work with some abstract string with builder:

class (IsString s, Monoid (Builder s)) => IsInterpolatedString s where
  type Builder s = b | b -> s
  toBuilder   :: s -> Builder s
  fromBuilder :: Builder s -> s
  interpolateString  :: String -> Builder s
  interpolateInteger :: Integer -> Builder s
  {- Probably a lot of primitives -}

and Interpolate could have polymorphic return type:

class Interpolate a where
  iterpolate :: IsInterpolatedString s => a -> Builder s

Another question. Do we need extensibility? We have quasiquotes already and they serve basically same niche.

Formatting

And I what are plans for formatting options?

Why would someone write an instance polymorphic on String? Just write it on the specific types you use Foo with in your project

I’m not actually sure String would be noticably less performant, since it’s a flat mconcat. Maybe try running some benchmarks?

EDIT: Also, since it’s extensible, text could define their own interpolator that interpolates with Text.Builder, if performance is a bigger concern

Quasiquoters are really heavyweight; you have to use TH and figure out how to convert a string into an Exp (not sure how discoverable haskell-src-meta or ghc-meta are), both of which aren’t very welcoming to newer people. Unlike, say, macros in Rust or template literals in JS, which beginners easily use.

I’m not a fan of hardcoding a special formatting syntax a la Python f-strings. It feels too special-cased for me. It’s easily done IMO in userland, just use functions

pad :: Int -> Int -> String
pad n x = replicate (n - length (show x)) "0" ++ show x

s"Month: ${pad 2 month}"
3 Likes

That’s very weird question. Because it’s natural thing to do. It certainly beats writing 3 instances for string, text and lazy text. And it’s good smoke test. If polymorphic code starts to behave weirdly and it’s difficult to explain what does it do design in question likely has problems.

EDIT

Idea is about being able to define single instance which works for all string types and isn’t performance liability

It has advantage of being compact. And very common I think most if not all string interpolation implementations have some variant of it. Problem is we don’t have vocabulary of such functions.

I (finally) had a quick try using the source plugin. My findings so far are :

except from explicit which is straight forward, everything else generates pages of errors, which I believe is not acceptable.

I was just trying to rewrite

                 [ "t-shirt#" <> tshow i <> "-" colour
                | i <- [11..14]
                , colour <- ["black"]
                ]

to

                [ s"t-shirt#${tshow i}-${colour}"
                | i <- [11..14]
                , colour <- ["black"]
                ]

The result is a Text and we have tshow = pack . show.

I first got bitten by forgetting to write the s, which luckily was caught by i not being in used (took me a while tto find he problem. I thought initially that the plugin wasn’t called properly".

I tried to “convert” an actual project to see what I would really need and use (this is part of the tests) . I’ll try this week with some SQL queries and maybe some Parser or Brick (things which have a IsString instance).

Other immediate feedbacks,
I don’t like the s (too easy to forget and feel wrong) (I would prefer “”“…”“” or a TH syntax like)
I don’t like the {..} for variable

2 Likes

Thanks for the feedback! Can you elaboarate on the pages of errors? I noticed in one of the issues you reported in the repo is the same error as normal OverloadedStrings behavior. I’m curious about interpolation-specific errors. I’m especially interested in the errors with the extensible-hasclass option, as that’s the option I’m currently leaning towards.

That’s surprising to me, as most languages use braces here. What would you prefer, just parentheses?

The first attempt to replace "t-shirt#" <> tshow i <> "-" colour" with "s"t-shirt#${i}-${colour}", I remove the tshow in purpose to see of the implicite interpolation will work. I get the following error message

/home/max/Exp/test-interpolation/test/Handler/WHStocktakeSpec.hs:34:9: error:
    • Could not deduce (Num a0)
        arising from a type ambiguity check for
        the inferred type for ‘skus’
      from the context: (Num a, Enum a, IsString a2, IsString a3,
                         Interpolate a a2, Interpolate a3 a2)
        bound by the inferred type for ‘skus’:
                   forall {a} {a2} {a3}.
                   (Num a, Enum a, IsString a2, IsString a3, Interpolate a a2,
                    Interpolate a3 a2) =>
                   [a2]
        at /home/max/Exp/test-interpolation/test/Handler/WHStocktakeSpec.hs:(34,9)-(43,21)
      The type variable ‘a0’ is ambiguous
      These potential instances exist:
        instance Num a => Num (Identity a)
          -- Defined in ‘Data.Functor.Identity’
        instance Num Int32 -- Defined in ‘GHC.Int’
        instance Num Int64 -- Defined in ‘GHC.Int’
        ...plus 10 others
        ...plus 47 instances involving out-of-scope types
        (use -fprint-potential-instances to see them all)
    • In the ambiguity check for the inferred type for ‘skus’
      To defer the ambiguity check to use sites, enable AllowAmbiguousTypes
      When checking the inferred type
        skus :: forall {a1} {a2} {a3}.
                (Num a1, Enum a1, IsString a2, IsString a3, Interpolate a1 a2,
                 Interpolate a3 a2) =>
                [a2]
      In the second argument of ‘($)’, namely
        ‘do insertUnique $ Operator "John" "Smith" "Jack" True
            let skus = ... ++ ...
            mapM
              (\ sku
                 -> rawExecute
                      "insert ignore into 0_stock_master (stock_id, long_description) values(?,?)"
                      $ map PersistText [sku, ....])
              skus’
   |
34 |                ,"t-shirt-red"
   |         ^^^^^^^^^^^^^^^^^^^^^...

/home/max/Exp/test-interpolation/test/Handler/WHStocktakeSpec.hs:44:5: error:
    • Ambiguous type variable ‘t0’ arising from a use of ‘mapM’
      prevents the constraint ‘(Traversable t0)’ from being solved.
      Probable fix: use a type annotation to specify what ‘t0’ should be.
      These potential instances exist:
        instance Traversable (Either a) -- Defined in ‘Data.Traversable’
        instance Traversable Identity -- Defined in ‘Data.Traversable’
        instance Traversable Down -- Defined in ‘Data.Traversable’
        ...plus 10 others
        ...plus 193 instances involving out-of-scope types
        (use -fprint-potential-instances to see them all)
    • In a stmt of a 'do' block:
        mapM
          (\ sku
             -> rawExecute
                  "insert ignore into 0_stock_master (stock_id, long_description) values(?,?)"
                  $ map PersistText [sku, ""])
          skus
      In the second argument of ‘($)’, namely
        ‘do insertUnique $ Operator "John" "Smith" "Jack" True
            let skus = ... ++ ...
            mapM
              (\ sku
                 -> rawExecute
                      "insert ignore into 0_stock_master (stock_id, long_description) values(?,?)"
                      $ map PersistText [sku, ....])
              skus’
      In a stmt of a 'do' block:
        runDB
          $ do insertUnique $ Operator "John" "Smith" "Jack" True
               let skus = ... ++ ...
               mapM
                 (\ sku
                    -> rawExecute
                         "insert ignore into 0_stock_master (stock_id, long_description) values(?,?)"
                         $ map PersistText [sku, ....])
                 skus
   |
44 |                              $ map PersistText [sku, ""]
   |     ^^^^
Failed, one module loaded.


Ok there are only two errors but I even didn’t try to read them. It might not be different from normal OverloadedStrings related error, but in this case you have to deal with 3 types , the interpolated strings itself and the 2 “fragment”.
Also it is not clear how to force type , can I do "s{i :: Int} or "s{@Int i}" ?

Ala PHP, ie s"name= $name" shortcut for s"name=${name}". Allows bracket for expression as in s"name = ${capitalize name}"
It might be nitpicking but we are doing syntaxic sugar there and s"name =${name}" is not much better than "name ="<>name, compare the three

s"name=${name}"
"name="<>name
s"name=$name"

s"name=${name}"
“name=”<>name
s"name=$name"

To be fair, compare:

s"Hi ${name}, does ${otherPerson} know how ${function1} and ${function2} work?"
"Hi " <> name <> ", does " <> otherPerson <> " know how " <> function1 <> " and " <> function2 <> " work?"

And with the second, it’s a lot of leading/trailing spaces that are annoying to get right / not forget. AND lots of double quotes to start and close strings.

1 Like

Why add spaces ? You are happy to write ${name} without space but not "<>name<>".

Anyway I am all for string interpolation I just prefer s"Hi $name, does $otherPerson know how $function1 and $function2 work?" when possible.

Why add spaces ?

Are you talking about the spaces around <>?
(the spaces in the strings are there because of English, but I’m assuming that’s not what you meant, but that’s the spaces I was talking about)

The spaces around the operators are just my preference.

"Hi "<>name<>", does "<>otherPerson<>" know how "<>function1<>" and "<>function2<>" work?"

This feels like a worse “interpolation”, TBH.
${ vs. "<>, and } vs. <>")

I prefer the ${..} syntax because it makes it obvious where the variables are. (FYI, I also prefer ${A} over $A in bash scripts for example.)

I agree that about the english spaces that

in

"Hi " <> name <> ", does " <> otherPerson <> " know how " <> function1 <> " and " <> function2 <> " work?"

but not in

s"Hi ${name}, does ${otherPerson} know how ${function1} and ${function2} work?"

Thus, try

"Hi "<>name<>", does "<>otherPerson<>" know how "<>function1<>" and "<>function2<>" work?"

which is I think solves the space noise equally. If your preference is to have spaces around operator and thus more space noise, then fine. That is your preference.

Bash gives you the choice, I prefer to have this choice.

1 Like

Sure, I can add $name. I liked there being just one way to interpolate, and not being forced to escape every dollar sign in an interpolated string (only if there’s a brace afterwards), but it’s not a big deal.

Yes, you can put any expression in the bracket, so s"${x :: Int}".

Can you show your original + new code? It’s not clear to me how string interpolation causes the ambiguous Traversable error for mapM. It’s also not clear how you have an ambiguous Num constraint; your old code should also be ambiguous if you did show x

It’s not clear to me neither, which is why I stopped reading the error message :slight_smile:

The code is

  runDB $ do
    insertUnique $ Operator "John" "Smith" "Jack" True
    -- Variations need to exist in FrontAccounting to be valid
    let skus = ["t-shirt-black"
               ,"t-shirt-red"
               ,"t-shirtA-black"
               ,"t-shirtB-black"
               ,"t-shirtC-black"
               ,"t-shirtD-black"
               ] ++ [ s"t-shirt#${i}-${colour}"
                    | i <- [11..14]
                    , colour <- ["black"]
                    ]
    mapM (\sku -> rawExecute "insert ignore into 0_stock_master (stock_id, long_description) values(?,?)" 
                             $ map PersistText [sku, ""]
         ) skus

The traversable constraints comes from mapM.
i could be any Num indeed and colour any IsString indeed.
But in the original code "t-shirt#" <> tshow i <> "-" <> colour, colour is constrained to be of the same type as the resulting “stringy” (in our case Text). This is not the case anymore since we allow colour to be interpolated.

I think it is just the way of the compiler to tell us that it lost the plot and doesn’t know the type of skus even though for us it is obvious that it should be a list of something. mapM is not monomorphic (anymore) and accepts any Traversable. Yet this is a simple example of how simple errors can quickly grow out of control and makes string interpolation user’s life miserable instead of enhancing it :wink:

It’s still not clear why Traversable is an issue. skus is very clearly a list, unless you have OverloadedLists enabled? You should also see an issue without string interpolation.

Num should also have happened before string interpolation; tshow i is ambiguous if i is a polymorphic Num a => a

The only ambiguity I agree with here is that colour was previously constrained and is now ambiguous. I don’t personally see an issue with it, and don’t think it’d be that surprising if you were writing it as an interpolated string from the beginning (instead of rewriting a manually concatenated expression). But I acknowledge the wart there

1 Like

No OverloadedList enabled. The difference I can see with tshow is tshow returns a known type (Text) which might then allows defaulting or Num to Int whereas interpolate can (in theory) returns anything : there are two types to guess (input and output) instead of one. This might prevent defaulting.

Also, at the moment there is no instance of Interpolate Text Int.
I am sure in the future this would be solve for Int but the same problem may arise with custom type. The problem being that the lack of interpolate instance generates a totally different message.

Maybe interpolate should return a String and expand the code to fromString (interpolate ...) instead of just interpolate. Optimal implementation toward let’s say Text might be done using rewrite rules (of fromString . interpolate). Just an idea.

This is very cool! Thanks for putting this together. I also like extensible-hasclass, but extensible-th is strictly more general. Specifically, you can define extensibleTHFromExtensibleHasClass to get the behaviour of extensible-hasclass in terms of extensible-th. Here’s an example using your extensible-hasclass version of sql in terms of extensible-th.

By contrast, there are things that extensible-th can do that extensible-hasclass cannot, so I think the former should be preferred. In fact, I suspect that extensnible-th generalizes all the others.

{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TemplateHaskell #-}

module MyProject where

import Data.String (IsString (..))
import Data.String.Syntax.ExtensibleHasClass (HasClass (HasClass))
import Data.String.Syntax.ExtensibleTH (Interpolate (..))
import Data.Text (Text)
import qualified Data.Text as Text
import Language.Haskell.TH

extensibleTHFromExtensibleHasClass ::
  -- | @([Either String (HasClass c)] -> a)@
  Q Exp ->
  [Either String (Q Exp)] ->
  Q Exp
extensibleTHFromExtensibleHasClass k parts =
  [|$k $(listE (map go parts))|]
  where
    go = \case
      Left str -> [|Left str|]
      Right e -> [|Right (HasClass $e)|]

sqlImpl :: [Either String (HasClass ToSqlValue)] -> SqlQuery
sqlImpl = mconcat . map go
  where
    go :: Either String (HasClass ToSqlValue) -> SqlQuery
    go = \case
      Left s -> SqlQuery s []
      Right (HasClass v) -> SqlQuery "?" [toSqlValue v]

sql :: [Either String (Q Exp)] -> Q Exp
sql = extensibleTHFromExtensibleHasClass [|sqlImpl|]

Using it gives the expected results:

Summary
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE RecordWildCards #-}

module MyProjectUse where

import Data.String (IsString (..))
import Data.String.Syntax.ExtensibleTH (Interpolate)
import Data.Text (Text)
import MyProject

data Person = Person {name :: String, age :: Int}

instance Interpolate Person where
  interpolate Person {..} = s"Person<${name}, ${age}>"

main :: IO ()
main = do
  let name = "Alice" :: String; age = 30 :: Int; person = Person {..}

  print (s"Hello ${name}! Your age is ${age}" :: String)
  print (s"You are: ${person}" :: String)

  let name2 = fromString "Alice" :: Text
  print (s"Hello ${name2}! Your age is ${age}" :: Text)

  -- use our own `sql` function defined in Lib
  print sql"SELECT * FROM user WHERE name ILIKE ${name} AND age = ${age}"
ghci> main
"Hello Alice! Your age is 30"
"You are: Person<Alice, 30>"
"Hello Alice! Your age is 30"
SqlQuery {sqlText = "SELECT * FROM user WHERE name ILIKE ? AND age = ?", sqlValues = [SqlString "Alice",SqlInt 30]}
1 Like

Is extensible-th using template haskell under the hood ?
If that is the case it will come with the drawbacks (e.g. stage restriction, compile time) that with already existing TH based solution.

Also, I am normally all for the more powerful solution, but my experience so far converting existing code, (as opposed to create compiling example) is that error messages are not really helpful and it is unfortunately easier to switch back to normal code than trying to make the expanded code compile.

I am worried that common TH drawbacks will create that type of unexpected error.

I like extensible-th in the combination with the HasClass idea as proposed by Tom above. I think that design could be simplified even more by removing HasClass (not tested, though):

{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TemplateHaskell #-}

module MyProject where

import Data.String (IsString (..))
import Data.String.Syntax.ExtensibleHasClass (HasClass (HasClass))
import Data.String.Syntax.ExtensibleTH (Interpolate (..))
import Data.Text (Text)
import qualified Data.Text as Text
import Language.Haskell.TH

extensibleTHFromExtensibleHasClass ::
  -- | @(TExp String -> TExp str)@
  (Q Exp -> Q Exp) ->
  -- | @(TExp v -> TExp str)@
  (Q Exp -> Q Exp) ->
  [Either String (Q Exp)] ->
  Q Exp -- ^ TExp (Monoid str => str) -- roughly
extensibleTHFromExtensibleHasClass mkStr mkVal parts =
  [|mconcat $(listE (map go parts))|]
  where
    go = \case
      Left str -> mkStr str
      Right e -> mkVal e

sql :: [Either String (Q Exp)] -> Q Exp
sql = extensibleTHFromExtensibleHasClass (s -> [| SqlQuery $s [] |]) (v -> [| SqlQuery "?" (toSqlValue $v) |])

Re: TH: Any serious issues with TH (impact on compile-times, build parallelism, cross compilation) are likely to be resolved in the mid-term future. The first two will be solved by the new -XExplicitLevelImports proposal which is likely to be accepted soon. IIUC enabling TH for cross compilation is “just” a huge engineering effort that mostly concerns adding support for retargetable package DBs to Cabal etc. (The stage restriction does not appear to be a problem worth fixing in practice, but it could be solved, too.)

3 Likes