Need a review of linear-typed API

danidiaz · February 3, 2022, 10:28pm

Ah, I went directly to the repo, that explains the confusion

runBuilder :: (Builder %1 -> Builder) -> Text is more restrictive than returning some arbitrary type wrapped in Ur, but it’s simpler and it does look safe.

Bodigrim · February 3, 2022, 10:29pm

Sorry for the confusion, I pushed another commit on top of yesterday’s discussion. Thanks for catching the leakage of Builder!

aspiwack · February 4, 2022, 8:07am

As a way to convince yourself that the type of runBuilder is safe (or, at least as safe as the traditional API), you could, instead, have given yourself

newBuilder :: (Builder %1 -> Ur a) %1 -> Ur a
runBuilder' :: Builder %1 -> Ur Text

Then you could have defined runBuilder as follows:

runBuilder :: (Builder %1 -> Builder) -> Text
runBuilder f = unur $ newBuilder build
  where
     build :: Builder %1 -> Ur Text
     build = runBuilder' . f
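The derivation above can be tried out with a toy model. This is only a sketch under stated assumptions: Text is replaced by String, Ur is redefined locally rather than imported from linear-base, and append is a hypothetical helper that is not part of the library.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE LinearTypes #-}

-- Assumption: String stands in for Text so that only base is needed.
type Text = String

-- Local stand-in for Ur from linear-base: the field is unrestricted.
data Ur a where
  Ur :: a -> Ur a

unur :: Ur a %1 -> a
unur (Ur a) = a

-- Toy Builder: a box around the text built so far (not the real representation).
data Builder where
  Builder :: Text -> Builder

-- Hands a fresh Builder to a linear continuation.
newBuilder :: (Builder %1 -> Ur a) %1 -> Ur a
newBuilder k = k (Builder "")

-- Consumes the Builder and returns its contents as an unrestricted value.
runBuilder' :: Builder %1 -> Ur Text
runBuilder' (Builder t) = Ur t

-- Hypothetical helper, only here to have something to run.
append :: Builder %1 -> Text -> Builder
append (Builder t) s = Builder (t ++ s)

-- runBuilder expressed through the two primitives, as in the post above.
runBuilder :: (Builder %1 -> Builder) -> Text
runBuilder f = unur (newBuilder build)
  where
    build :: Builder %1 -> Ur Text
    build b = runBuilder' (f b)
```

Loaded into GHCi, runBuilder (\b -> append (append b "foo") "bar") evaluates to "foobar". The composition runBuilder' . f is written out pointwise here so the sketch does not depend on a linear composition operator.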

I’ve had a quick look at the implementation, it has a lot of unsafeCoerce-s. It’s difficult to estimate the cost of these (though it’s definitely not trivial because unsafeCoerce between non-linear and linear functions currently prevents some inlining optimisations).

It would be worth considering defining Builder as

data Builder where
  Builder :: Text -> Builder

(notice the non-linear arrow)

This would mean one extra box everywhere (I have to admit that I haven’t yet gotten around to doing the worker-wrapper split for unrestricted constructors like this; however, inlining should still remove a bunch of the boxes), in exchange for avoiding a lot of unsafeCoerce-s.
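To illustrate what the non-linear arrow in the constructor buys, here is a small sketch with two hypothetical types, Boxed and LinBoxed (neither is from the library). A field declared with a plain arrow stays unrestricted even when the box itself is consumed linearly, which is presumably why the coercions become unnecessary.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE LinearTypes #-}

-- Field declared with a plain arrow: unrestricted.
data Boxed where
  Boxed :: String -> Boxed

-- Field declared with a linear arrow, for contrast.
data LinBoxed where
  LinBoxed :: String %1 -> LinBoxed

-- Accepted: the box is consumed once, but s is unrestricted after
-- unpacking, so it may be used twice.
twice :: Boxed %1 -> (String, String)
twice (Boxed s) = (s, s)

-- With a linear field, the contents must be used exactly once.
once :: LinBoxed %1 -> String
once (LinBoxed s) = s

-- The analogue of twice for LinBoxed is expected to be rejected,
-- because s would be linear and is used twice:
-- twiceLin :: LinBoxed %1 -> (String, String)
-- twiceLin (LinBoxed s) = (s, s)
```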

I honestly don’t know which is faster.

Bodigrim · February 6, 2022, 9:46pm

Thanks @aspiwack! The GADT definition allows removing all unsafeCoerce-s. Performance remains the same, however, because worker-wrapper does not seem to kick in. In fact, if I convince GHC not to split functions into workers and wrappers, benchmarks get faster.

aspiwack · February 7, 2022, 8:15am

I’ve got to admit, it’s pretty funny that deactivating worker-wrapper split makes the code faster. It’s probably a coincidence though.

The reason why worker-wrapper split is not available for unrestricted types is simply that there is no unrestricted unboxed tuple (I wrote a bit in the wiki). There is a bit of design to do, and then it’s mostly just a matter of putting the time in.

Bodigrim · April 11, 2022, 7:09pm

So far so good, my experimental linear Text builder makes blaze-markup benchmarks twice as fast. Would anyone else like to take a look, before it pollutes Hackage forever?

atravers · April 11, 2022, 10:25pm

If the API is “small”, maybe it could be added to an existing package, if there is a suitable one in Hackage.
Otherwise, and if you’ve received few or no comments from other users, announce on a public forum, like here or one of the mailing lists, the exact time and date when you will add it permanently to Hackage. That way, if someone complains later, you can just send them a link to the relevant post.

danidiaz · April 12, 2022, 5:16pm

Perhaps the README could go into a bit more detail about how the library uses linearity to achieve performance.

A question about

(|>) ∷ Buffer ⊸ Text → Buffer

IIUC, this means that you (linearly) supply a Buffer and get a function to which you can supply different Text values, getting a different Buffer each time. For it to be safe, don’t you need to copy the underlying array each time?

Edit: I misread the signature. If you are in a linear context, you can only use the resulting Text → Buffer function once. You can’t use it multiple times with different arguments. That said, the Text argument can be used unrestrictedly inside the function.

Bodigrim · April 13, 2022, 5:39pm

@danidiaz right, you can define

bar :: Buffer -> Buffer
bar buf = (\f -> f "foo" >< f "bar") (buf |>)

but you cannot pass it to runBuffer, because it requires Buffer ⊸ Buffer.
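A toy model makes this distinction concrete. The sketch below is an assumption-laden stand-in, not the library's code: Buffer wraps an immutable String, and finish is a hypothetical helper. In this model bar is harmless, whereas a real buffer mutated in place would not be safe to use twice.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE LinearTypes #-}

-- Toy Buffer: an immutable String in a box (the real one is a mutable array).
data Buffer where
  Buffer :: String -> Buffer

-- Append text: linear in the Buffer, unrestricted in the text.
infixl 6 |>
(|>) :: Buffer %1 -> String -> Buffer
Buffer s |> t = Buffer (s ++ t)

-- Concatenate two buffers, consuming both.
(><) :: Buffer %1 -> Buffer %1 -> Buffer
Buffer a >< Buffer b = Buffer (a ++ b)

-- Hypothetical helper: extract the contents.
finish :: Buffer %1 -> String
finish (Buffer s) = s

-- Only linear transformations are accepted here.
runBuffer :: (Buffer %1 -> Buffer) -> String
runBuffer f = finish (f (Buffer ""))

-- Typechecks with an ordinary arrow: buf is unrestricted, so the
-- partial application may be applied twice.
bar :: Buffer -> Buffer
bar buf = (\f -> f "foo" >< f "bar") ((|>) buf)

-- A linear pipeline, which runBuffer accepts.
ok :: String
ok = runBuffer (\b -> b |> "foo" |> "bar")

-- Expected to be rejected, since bar is not a linear function:
-- bad = runBuffer bar
```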

Bodigrim · April 13, 2022, 6:16pm

I’ve uploaded a candidate package, rendered haddocks are available at Data.Text.Builder.Linear

jaror · April 15, 2022, 4:37pm

Why did you benchmark the Data.Text.Lazy.Builder.Builder type against your Buffer and not against your Builder? Your Builder is also faster than the standard builder, but slower than manipulating Buffers directly. And I don’t think your Builder interface requires linear types. Maybe you should warn that for the best performance users should use the Buffer type directly.

Bodigrim · April 15, 2022, 5:36pm

Sorry, this is not intentional: when I wrote benchmarks, there was no Builder interface yet, only Buffer. And yes, it’s expected to be a bit faster.

jaror · April 15, 2022, 8:34pm

Have you considered making runBuilder nonlinear, i.e. runBuilder :: Builder -> Text? I don’t think that introduces any unsafety.

Bodigrim · April 15, 2022, 8:45pm

@jaror done in Make runBuilder multiplicity polymorphic · Bodigrim/linear-builder@d3f3159 · GitHub

tomjaguarpaw · April 15, 2022, 9:31pm

I think you’re right, but does this actually achieve anything? Isn’t a -o b a subtype of (or at least has an injection to) a -> b?

Bodigrim · April 15, 2022, 9:43pm

Indeed, if you have

f :: a ⊸ b

you can always define

g :: a -> b 
g a = f a

So it is preferable to mark linear functions as such, as clients can always downgrade to a normal arrow, if it fits their API better.
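The same downgrade on a concrete function, as a small sketch; swapL, swapU and swapAll are hypothetical names chosen for illustration. The eta-expansion is written out explicitly, matching the definition of g above.

```haskell
{-# LANGUAGE LinearTypes #-}

-- A linear function: each component of the pair is used exactly once.
swapL :: (a, b) %1 -> (b, a)
swapL (x, y) = (y, x)

-- Downgrade to an ordinary arrow by eta-expansion.
swapU :: (a, b) -> (b, a)
swapU p = swapL p

-- The downgraded version fits APIs that expect ordinary functions.
swapAll :: [(a, b)] -> [(b, a)]
swapAll = map swapU
```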

jaror · April 15, 2022, 9:50pm

I asked because I believe that it is the only place where linearity is used in the Builder module (if you make the Builder type opaque), so that part of the API could just as well be added to the existing text package. That should already give a large speedup compared to the existing lazy text builder.

atravers · April 22, 2022, 12:07am

Hrm:

runBuilder :: (Builder %1 -> Builder) -> Text
runBuilder :: (Builder ⊸ Builder) -> Text

…anyone for plain ol’ boring:

runBuilder :: (*Builder -> Builder) -> Text

It’s concise and {-# LANGUAGE UnicodeSyntax #-}-free.

Bodigrim · April 22, 2022, 10:41pm

@atravers I made runBuilder arrow-polymorphic, see updated Data.Text.Builder.Linear

Bodigrim · May 22, 2022, 4:31pm

Alright, last call for reviews. I’ve added a section on design.
https://hackage.haskell.org/package/text-builder-linear-0.1/candidate