GHC String Interpolation - Final Survey

Final survey for GHC String Interpolation:

This survey is your last chance to submit comments before I finalize the proposal for the committee! I will not be looking at the previous survey results or comments on any of the Discourse threads.

Thank you @brandonchinn178! I see you briefly mentioned in the brandonchinn178/string-syntax repository on GitHub that implicit-only-string would cost us auto-escaping SQL queries. Would it be wise to still offer it, if that means it cannot be reliably used for such things? Are there other alternatives that would be an open door for injections?

Do SqlQuery implementations today provide IsString instances? If so, I would either implement implicit-only-string to use a different class, or just say “don’t use interpolation with SQL, as it runs the risk of injection”. I feel like any use of interpolation with SQL is automatically a warning flag anyway.

I won’t change anything now, but we can see what our options are if implicit-only-string is the most popular.

Do you mean types like Database.PostgreSQL.Simple.Query or something more sophisticated? If the former, the answer is “yes”.

I was thinking more sophisticated, but that example’s sufficient as a counterexample :slight_smile: Yeah, I’ll tackle that later, if the option is voted on.
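
To make the concern concrete, here is a minimal sketch of why an IsString instance on a query type leaves the door open. The `Query` newtype below is a hypothetical stand-in, not the postgresql-simple type, and `unsafeLookup` assumes that an interpolation ends up calling `fromString` on the already-concatenated text; the actual desugaring of implicit-only-string may differ.

```haskell
import Data.String (IsString (..))

-- Hypothetical stand-in for a library query type that has an IsString instance.
newtype Query = Query String
  deriving (Eq, Show)

instance IsString Query where
  fromString = Query

-- Assumption: the interpolation is turned into one String and then passed to
-- fromString, so the user-supplied value lands in the SQL text unescaped.
unsafeLookup :: String -> Query
unsafeLookup name =
  fromString ("select * from users where name = '" <> name <> "'")
```

Nothing in the types distinguishes the trusted SQL text from the untrusted `name`, which is why a separate class (or simply not offering interpolation for such types) comes up as an option.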

I’m trying to fill out the survey, but it’s biased and doesn’t allow me to rank all options at 6 (worst).

The IsString class is misdesigned and TH is a wart in the ecosystem. All options rely on one of them.

So I want to express the opinion that I like none of the options.

That’s not ranked choice, then :slightly_smiling_face:

Rank the options relative to each other, then vote your disapproval of all of them in the third question.

Out of curiosity, what design would you put forward?

A more restricted subset of TH, I guess (I have no idea how!). We definitely need to be able to express failure.

Thanks for your input. :slight_smile:

Thanks for trying! Yes, it would be nice if everyone tried the options live, but probably won’t happen :frowning_face:

I noticed you didn’t respond in the survey. Please do so, as otherwise you will not be included in the feedback :slight_smile:

@brandonchinn178 I have a question: in your opinion, which implementation is the most opaque to end-users, so that it could be swapped in the future without breaking the interface?

Something to consider is that if you implement the interpolation syntax “a ${b} c” then you might be able to implement the extensible-th variants by quoting an interpolated string.

[| "a ${b} c" |]

I think that the problem of string interpolation should be thought of in terms of a simple multi-stage system, which to me favours one of the implicit variants. I don’t really understand what the Builder is for, so I think implicit-no-builder or explicit look best to me.

The ability to handle SQL code (lack of which is sold as a disadvantage of some of the options) is, in my opinion, not a good thing. In theory, you could get a builder that holds enough state to replace each interpolated value with a placeholder and use proper query params for it. That would already be tricky (see below), but worse, by hiding the fact that there’s some magic going on and making it look like normal string concatenation, it would encourage using normal string concatenation in other contexts. And that’s a recipe for SQL injection. I’d rather not have language design encourage this.

And, for a use case where getting quoting right is tricky: a backend for displaying a table (web-based, GUI app, doesn’t matter). The table can be filtered on any column, so the resulting SQL could look like
“select * from table where $conditions”, with conditions being a list of “col_x = $val_x”. Figure out which one is syntax and which one is an argument to be put into a query param. Now consider that col_x could be a subexpression on its own. And you still have to do it at a fully generic level, without any knowledge of what application it’ll be used in.

Question: does the explicit version (or any for that matter) allow for arbitrary expressions in the splices? I can imagine doing f"foo ${Text.pack $ show thing} bar".

Stuff like that would make me favor explicit. The type errors with explicit would also be very nice. It would just be a plain “expected Text, got String” kind of stuff.

This really isn’t so scary. You make a class instance for interpolating a SqlFragment into a SqlFragment that doesn’t do any escaping or parameterizing, and you make it difficult to construct a SqlFragment without going through the string interpolation interface. Then you do something like:

makeQuery :: (a -> SqlFragment) -> [(a, String)] -> SqlFragment
makeQuery colToSql conds = sql"select * from table where ${condsSql}"
  where
  condsSql :: SqlFragment
  condsSql = fold1 $ intersperse sql"and" $ map (\(c, v) -> sql"${colToSql c} = ${v}") conds

-- In client code, you'd have (or generate with TH/generics):
colToSql :: Column -> SqlFragment
colToSql = \case
  Name -> sql"name"
  Dept -> sql"dept"
  -- ...

(Can be made more complicated to account for columns not all being String-typed, of course, but I didn’t want to clutter up the example.)
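
For readers who want to try the idea without the proposed `sql"..."` syntax, here is a rough sketch in today's Haskell. Everything in it is an assumption made for illustration: the `SqlFragment` representation (SQL text with `?` placeholders plus an ordered parameter list), the helpers `raw` and `param`, and the use of `mconcat` where the snippet above uses `fold1`. In a real library `raw` would not be exported freely, so that fragments can only be built from trusted literals.

```haskell
{-# LANGUAGE LambdaCase #-}

import Data.List (intersperse)

-- Hypothetical fragment type: SQL text with "?" placeholders, plus the
-- values to bind to those placeholders, in order.
data SqlFragment = SqlFragment
  { fragmentText   :: String
  , fragmentParams :: [String]
  }
  deriving (Eq, Show)

-- Interpolating a fragment into a fragment is plain concatenation of both
-- the text and the parameter lists; no escaping happens here.
instance Semigroup SqlFragment where
  SqlFragment t1 p1 <> SqlFragment t2 p2 = SqlFragment (t1 <> t2) (p1 <> p2)

instance Monoid SqlFragment where
  mempty = SqlFragment "" []

-- Trusted, literal SQL (what a sql"..." literal would contribute).
raw :: String -> SqlFragment
raw t = SqlFragment t []

-- Untrusted value: contributes a placeholder and one bound parameter.
param :: String -> SqlFragment
param v = SqlFragment "?" [v]

data Column = Name | Dept

colToSql :: Column -> SqlFragment
colToSql = \case
  Name -> raw "name"
  Dept -> raw "dept"

-- Note: an empty condition list yields an incomplete WHERE clause; a real
-- implementation would need to handle that case.
makeQuery :: (a -> SqlFragment) -> [(a, String)] -> SqlFragment
makeQuery toSql conds = raw "select * from table where " <> condsSql
  where
    condsSql =
      mconcat $
        intersperse (raw " and ") $
          map (\(c, v) -> toSql c <> raw " = " <> param v) conds
```

With this representation, the column names end up in the SQL text while the filter values only ever appear in `fragmentParams`, which is the syntax-versus-argument split the question above asks about.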

Just a reminder to folks that Discourse threads won’t be included in the survey, so please fill out the survey if you haven’t done so. Happy to answer general questions in the thread, though.

@Kleidukos They all seem pretty opaque to me. Is there an option that doesn’t seem opaque to you? The only issue I see is that the extensible variants wouldn’t be able to be swapped out if people are writing extensible interpolators. But if we’re only talking about s"..." strings, they should all be pretty opaque, modulo error messages.

@mpickering That’s a good point, you could recover extensible-th with quotes, but you’d have to do so via quasiquotes, not a native foo"..." syntax. Which you could already do today, so not sure what the benefit would be…
EDIT: Actually, I don’t think this is possible. For the simple case of only interpolating single variable names, sure, but string interpolation quasiquoters today already do that. To allow any arbitrary expression, you’d need to parse it into a Haskell expression in order to anti-quote it in a TH quote, which doesn’t make it any easier than today.

@FPtje Yes, arbitrary expressions can be interpolated, in any option.

@Torinthiel as @rhendric mentioned, the issue of interpolating nested expressions is solvable. For example, in JavaScript there is andywer/squid on GitHub, which provides SQL tagged template strings and schema definition functions. I also don’t see it as “encouraging string concatenation”; string interpolation is very much a magic thing that I think most devs are aware of. Look at the popularity of Python’s f-strings, which have a completely different language embedded. I think users can see string interpolation = magic, and I don’t see a concern of “oh it’s safe with string interpolation, so it’s safe to concat strings myself”.

For reference, did you see the Python t-string developments?

Yeah, I saw the proposal, it seems fine, but probably won’t work for Haskell. t-strings are Python’s workaround to implement JavaScript’s tagged template literals, which is what I was aiming for more directly with the extensible-* variants.

Python’s or JavaScript’s implementation won’t translate easily to Haskell because they both allow “any” value.

# python
def foo(template):
    # template.strings = ("Name: ", "!")
    # template.values = (name,)
    pass

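foo(t"Name: {name}!")

// javascript
// the tag function receives the interpolated values as separate arguments,
// so collect them with a rest parameter
function foo(strs, ...vals) {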
  // strs = ["Name: ", "!"]
  // vals = [name]
}
foo`Name: ${name}!`

For Haskell, you can’t work with “any” value; my approach in extensible-hasclass was to specify HasClass Show to say “this template only works with values with a Show instance”.

foo :: [Either String (HasClass Show)] -> ...
foo vals =
  -- vals = [Left "Name: ", Right (HasClass name), Left "!"]
  _

foo"Name: ${name}!"
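
As a rough illustration of how such a list could be consumed in today's Haskell, the sketch below defines one possible `HasClass` as an existential wrapper. This definition and the `render` function are assumptions made for the example; the actual definition in the string-syntax repo may differ.

```haskell
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

import Data.Kind (Constraint, Type)

-- Assumed definition: wraps any value whose type satisfies the class c,
-- hiding the type but keeping the class dictionary.
data HasClass (c :: Type -> Constraint) where
  HasClass :: c a => a -> HasClass c

-- Hypothetical interpolator: literal chunks are kept as-is, interpolated
-- values are rendered with show (the only thing we know about them).
render :: [Either String (HasClass Show)] -> String
render = concatMap (either id (\(HasClass x) -> show x))
```

Note that `show` on a `String` adds quotes, so a value like `"Alice"` renders with its quotation marks; a real interpolator would likely want a dedicated class rather than `Show`.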

This may diverge a bit from the survey, and maybe my concern is without cause. I was wondering why we go via “String”. I guess we want to say “This type has a representation as a Text or ByteString value (and, if it must be, a String value).” I guess this is what IsString does. To phrase it a bit differently: if possible, let us construct a Text or ByteString value, and only if we cannot do that (or do not want to do that), let’s construct a String.

You mean, why Either String (HasClass ...) and not Either Text (HasClass ...)? Because for better or worse, String is the fundamental string type. There’s no way to construct a Text literal in Haskell other than applying fromString to a String.

If you look at the repo linked in the survey, we use IsString so that all of the proposals will work with Text and ByteString.
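
A small sketch of the point about literals going through String: with OverloadedStrings, a string literal is treated as `fromString` applied to a `String`, whatever the target type is. The `Shout` type below is a made-up example using only base, chosen so that the `fromString` call is visible in the result; Text and ByteString follow the same route through their own IsString instances.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Char (toUpper)
import Data.String (IsString (..))

-- Hypothetical string-like type whose IsString instance upper-cases its
-- input, which makes the hidden fromString call observable.
newtype Shout = Shout String
  deriving (Eq, Show)

instance IsString Shout where
  fromString = Shout . map toUpper

-- The literal is elaborated as: fromString ("hello" :: String)
greeting :: Shout
greeting = "hello"
```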