Reflecting Code to Syntax (with Arrows)

tomjaguarpaw · February 3, 2026, 6:48pm

Not sure what you mean exactly, but Opaleye uses arrows (or monads) to represent database queries that are “serialised” to SQL so maybe that’s evidence that what you think is an impediment is not actually one?

enobayram · February 3, 2026, 10:36pm

I believe what they mean is that even for a simple arrow like:

proc x -> do
  y <- f -< x
  z <- g -< x
  returnA -< (y,z)

You won’t get a desugaring of f &&& g, but it will be desugared to something that has arr fst etc. sprinkled all over the place. The desugaring loves arr so much that you won’t be able to extract any structure whatsoever that doesn’t have an arr inside it. So if you were hoping to find a &&& and compile it straight to something in your domain, you’ll be disappointed. IMO, arrow notation would’ve been so much better if it never produced arr unless you used a non-routing transformation in your proc do notation. And that could be a superclass of Arrow and it would be awesome like ApplicativeDo…
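A minimal sketch of that difference, assuming a hypothetical `Syn` syntax tree invented here for illustration (it is not from any library). `Prim` stands for a named domain primitive and `Arr` for an opaque Haskell function that a compiler backend could not inspect:

```haskell
{-# LANGUAGE Arrows, GADTs #-}
import Prelude hiding (id, (.))
import Control.Category
import Control.Arrow

-- Hypothetical syntax tree for arrow programs (illustration only).
data Syn a b where
  Prim   :: String -> Syn a b                   -- named domain primitive
  Arr    :: (a -> b) -> Syn a b                 -- opaque Haskell function
  Id     :: Syn a a
  Comp   :: Syn b c -> Syn a b -> Syn a c
  First  :: Syn a b -> Syn (a, c) (b, c)
  Fanout :: Syn a b -> Syn a c -> Syn a (b, c)

instance Category Syn where
  id  = Id
  (.) = Comp

instance Arrow Syn where
  arr   = Arr
  first = First
  (&&&) = Fanout

-- True if the tree contains an uninspectable 'Arr' node anywhere.
containsArr :: Syn a b -> Bool
containsArr (Arr _)      = True
containsArr (Comp g f)   = containsArr g || containsArr f
containsArr (First f)    = containsArr f
containsArr (Fanout f g) = containsArr f || containsArr g
containsArr _            = False

-- Written with the combinator: the tree is exactly Fanout (Prim "f") (Prim "g").
direct :: Syn Int (Int, Int)
direct = Prim "f" &&& Prim "g"

-- The same program in proc notation: the desugaring routes values through 'arr'.
viaProc :: Syn Int (Int, Int)
viaProc = proc x -> do
  y <- Prim "f" -< x
  z <- Prim "g" -< x
  returnA -< (y, z)
```

Here `containsArr direct` is `False`, while `containsArr viaProc` is `True`, since the proc version already ends in `returnA`, which defaults to `arr id`.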

That said, I believe, in your SQL compilation, you’re avoiding the arr issue by feeding an expression value to the arrow and inspecting the output, so your reflection technically happens at runtime and not desugaring time.

stevan · February 4, 2026, 6:00am

Yes, with Conal Elliott’s concat plugin and Oleg Grenrus’ overloaded plugin being workarounds.

enobayram · February 4, 2026, 7:06am

I admire concat but never felt brave enough to actually use it for anything, have you been using it as a backbone of anything serious? Did you experience bugs when your use case started to get more complicated?

stevan · February 4, 2026, 7:23am

No, I never got that far. I recently stumbled across Arrows to Arrows, Categories to Queries :: Reasonably Polymorphic and via the comment there I found: Feature Request: Lambda Calculus Notation · Issue #9 · jameshaydon/lawvere · GitHub . I haven’t had time to dig into it properly though.

tomjaguarpaw · February 4, 2026, 8:08am

Yes it does, though I’m not sure why that would be insufficient for @stevan’s goal of achieving serialization:

I may be misunderstanding the goal, however.

stevan · February 4, 2026, 8:34am

(I’m not aware of how Opaleye works under the hood to solve this.)

tomjaguarpaw · February 4, 2026, 10:26am

It’s because all types that Opaleye SelectArr arrows can meaningfully manipulate are abstract, of the form Column sqlType, where Column is defined something like

newtype Column sqlType = MkColumn String

and the string is a reference to a variable defined in an AST that the SelectArr builds. The definition of SelectArr is roughly

newtype SelectArr a b = MkSelectArr (Kleisli (State AST) a b)
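To make the mechanism concrete, here is a toy sketch under stated assumptions: the `AST` type, the hand-rolled `State`, and the `plusOne` and `reflect` names are all hypothetical and far simpler than Opaleye's real internals. It only illustrates why `arr` is harmless when the values flowing through the arrow are symbolic:

```haskell
import Prelude hiding (id, (.))
import Control.Category
import Control.Arrow

-- An abstract column: only the name of a variable in the AST being built.
newtype Column sqlType = MkColumn String

-- Toy AST (assumption): "name = expression" bindings plus a fresh-name counter.
data AST = AST [(String, String)] Int

-- Minimal state monad, written out so the sketch needs only base.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s1) = g s in (f a, s1)

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g =
    State $ \s -> let (h, s1) = f s; (a, s2) = g s1 in (h a, s2)

instance Monad (State s) where
  State g >>= k = State $ \s -> let (a, s1) = g s in runState (k a) s1

newtype SelectArr a b = MkSelectArr (Kleisli (State AST) a b)

instance Category SelectArr where
  id = MkSelectArr id
  MkSelectArr g . MkSelectArr f = MkSelectArr (g . f)

instance Arrow SelectArr where
  arr f = MkSelectArr (arr f)
  first (MkSelectArr f) = MkSelectArr (first f)

-- Hypothetical primitive: binds "<col> + 1" to a fresh variable in the AST.
plusOne :: SelectArr (Column Int) (Column Int)
plusOne = MkSelectArr $ Kleisli $ \(MkColumn c) -> State $ \(AST bs n) ->
  let v = "v" ++ show n
  in (MkColumn v, AST (bs ++ [(v, c ++ " + 1")]) (n + 1))

-- Reflection at run time: feed in a symbolic column and read off the AST.
reflect :: SelectArr (Column a) (Column b) -> String -> ([(String, String)], String)
reflect (MkSelectArr (Kleisli k)) input =
  let (MkColumn out, AST bs _) = runState (k (MkColumn input)) (AST [] 0)
  in (bs, out)
```

With these definitions, `reflect (plusOne >>> plusOne) "x"` gives `([("v0","x + 1"),("v1","v0 + 1")],"v1")`. Inserting an `arr id` between the two steps gives the same result, because `arr` can only shuffle `Column` values around and never touches the AST.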

jaror · February 4, 2026, 12:39pm

I don’t mean to play down your work, but I believe defining arrows in terms of Kleisli means you can’t gain any extra static analysis power compared to just using the monad directly. Have you considered using the state monad directly, and do you actually experience benefits of the arrow-based definition?

I’m also thinking of accelerate which has nice syntactic sugar such that users just write the AST directly without even realizing it. They use the same trick of opaque types that are actually just variables, for example to have overloaded pattern matching.

tomjaguarpaw · February 4, 2026, 1:53pm

Yes, that’s correct. Monads are equally inspectable using this technique. An analogous approach is documented in Simple and Compositional Reification of Monadic Embedded Languages.

Yes, that’s equivalent to just using the Monad instance of SelectArr () (which is called Select). However, that’s not the same as using the underlying Monad instance of the State AST. Instead it uses something called lateral which allows mapping Select to SelectArr:

lateral :: (i -> Select a) -> SelectArr i a

and which performs an SQL “lateral join”.

If you write your Opaleye database expression purely using arrow combinators then you know statically that it doesn’t use lateral (or indeed anything else that introduces a lateral join). I don’t actually know whether there’s a downside in Postgres to using a lateral join, but some DBMSes don’t support them, so I kept this distinction around as a matter of technical interest.

You can see this all implemented at https://hackage.haskell.org/package/opaleye-0.10.3.1/docs/src/Opaleye.Internal.QueryArr.html (the types are slightly different to what I’ve described here).

jaror · February 4, 2026, 2:40pm

That’s a cool paper! I guess you could even make it safe if you make the EBool type opaque and only expose functions to construct it. Pattern matching on EBool would lead to exotic terms, which break the compilation. That does seem more straightforward than PHOAS which is usually my go to approach to make HOAS safe.

Ah, I feel like you might be able to do the same with an applicative interface. I have posted about that idea before:

I wish I had the time to really put that to the test. I really wonder if there is a gap between monad and applicative where arrows do add something. It feels a bit like P vs NP, but then Applicative vs Arrow.

tomjaguarpaw · February 4, 2026, 3:12pm

Almost, but you can’t do restrict with an Applicative because it needs a possibly-data-dependent argument:

restrict :: SelectArr (Field SqlBool) ()

(restrict translates to SQL’s WHERE.)

jaror · February 4, 2026, 3:38pm

In my free applicative approach you’d write that like this:

data Select a where
  Restrict :: Select (Field SqlBool) -> Select ()
  Pure :: a -> Select a
  Ap :: Select (a -> b) -> Select a -> Select b
  -- ... other operators

tomjaguarpaw · February 4, 2026, 3:49pm

I’m not sure it makes sense semantically to have the argument be a Select. How do you envisage using it?

jaror · February 4, 2026, 4:08pm

I don’t have much experience with opaleye, but let’s take this example from your tutorial:

personAndBirthday ::
  Select (Field SqlText, Field SqlInt4, Field SqlText, Field SqlDate)
personAndBirthday = do
  (name, age, address) <- personSelect
  birthday             <- birthdaySelect

  where_ $ name .== bdName birthday

  pure (name, age, address, bdDay birthday)

For this we assume personSelect :: Select (Field SqlText, Field SqlInt4, Field SqlText) and birthdaySelect :: Select (Field SqlDate). Then we can write the personAndBirthday selection using my proposed free applicative like this:

data Select a where
  Restrict :: Select (Field SqlBool) -> Select ()
  Pure :: a -> Select a
  Ap :: Select (a -> b) -> Select a -> Select b

instance Functor Select where ...
instance Applicative Select where ...

personAndBirthday ::
  Select (Field SqlText, Field SqlInt4, Field SqlText, Field SqlDate)
personAndBirthday =
  let person = personSelect
      name = personName <$> person
      age = personAge <$> person
      address = personAddress <$> person
      birthday = birthdaySelect
  in
  Restrict ((.==) <$> name <*> (bdName <$> birthday))
    *> ((,,,) <$> name <*> age <*> address <*> (bdDay <$> birthday))

Of course this syntax is not ideal. I’m sure some things (like that embedded pattern matching I mentioned before) can be done to make it more readable. But I hope this illustrates my idea.

tomjaguarpaw · February 4, 2026, 4:33pm

Can you use this syntax whilst ensuring that the person table is not joined to itself 3 times? If so then conversely: if you did want to join the person table to itself how would you do that?

jaror · February 4, 2026, 6:58pm

Isn’t it the job of the query optimizer to figure out which joins should be performed? Sorry, it’s been a few years since I had a database course…

If you’re referring to the more general loss of sharing, there’s work on sharing recovery by Gill which was extended by McDonell et al. for accelerate.

tomjaguarpaw · February 4, 2026, 7:06pm

I mean there are two possible semantics, one which Cartesian products person to itself 3 times and one which doesn’t. That’s not a question of optimization, it’s just unclear which semantics your snippet gives. (I probably shouldn’t have originally said “join”. I really meant Cartesian product.)

I think this will all become clear once you try to implement it!

jaror · February 4, 2026, 9:26pm

I see, it’s like the difference between

let x = [1..10] in (,,) <$> ((+ 1) <$> x) <*> ((+ 2) <$> x) <*> ((+ 3) <$> x)

and

(\x -> (x + 1, x + 2, x + 3)) <$> [1..10].

I didn’t realize this because FRP is more like a ZipList which is idempotent, in the sense that duplicating the result of sampling a Behavior is the same as sampling the Behavior twice: (\x -> (x, x)) <$> b = (,) <$> b <*> b.
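A small runnable check of that distinction, as a sketch using only `base` (the names `product3`, `shared`, and `zipped` are made up for illustration):

```haskell
import Control.Applicative (ZipList (..))

xs :: [Int]
xs = [1 .. 10]

-- List applicative: every mention of xs is an independent draw,
-- so the result is a Cartesian product.
product3 :: [(Int, Int, Int)]
product3 = (,,) <$> ((+ 1) <$> xs) <*> ((+ 2) <$> xs) <*> ((+ 3) <$> xs)

-- One draw from xs, shared between the three components.
shared :: [(Int, Int, Int)]
shared = (\x -> (x + 1, x + 2, x + 3)) <$> xs

-- ZipList applicative: combining a value with itself pairs elements
-- positionally, so duplicating and re-sampling coincide.
zipped :: [(Int, Int, Int)]
zipped = getZipList ((,,) <$> ((+ 1) <$> z) <*> ((+ 2) <$> z) <*> ((+ 3) <$> z))
  where
    z = ZipList xs
```

Here `length product3` is 1000 and `length shared` is 10, while `zipped == shared` holds.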

Back to our example. Without Restrict, then, you’d write the example like this:

personAndBirthday ::
  Select (Field SqlText, Field SqlInt4, Field SqlText, Field SqlDate)
personAndBirthday =
  (\p b -> (name p, age p, address p, bdDay b)) <$> personSelect <*> birthdaySelect

It doesn’t seem easy to fit Restrict into that. My first thought is that you’d instead want a Filter:

   Filter :: (a -> Field SqlBool) -> Select a -> Select a

Which you’d be able to use like this:

personAndBirthday ::
  Select (Field SqlText, Field SqlInt4, Field SqlText, Field SqlDate)
personAndBirthday =
  (\(p, b) -> (personName p, personAge p, personAddress p, bdDay b)) <$>
    Filter (\(p, b) -> personName p .== bdName b) 
           ((,) <$> personSelect <*> birthdaySelect)
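To check that the `Filter` idea hangs together, here is a sketch with a list-based reference interpreter. Everything beyond the constructor names is an assumption: `Table`, `runList`, the record types, and the sample rows are invented for illustration, and the predicate returns a plain `Bool` instead of a symbolic `Field SqlBool`, so this shows only the intended semantics and not SQL generation.

```haskell
{-# LANGUAGE GADTs #-}

data Select a where
  Table  :: [a] -> Select a                        -- stand-in for a base table
  Pure   :: a -> Select a
  Ap     :: Select (a -> b) -> Select a -> Select b
  Filter :: (a -> Bool) -> Select a -> Select a

instance Functor Select where
  fmap f = Ap (Pure f)

instance Applicative Select where
  pure  = Pure
  (<*>) = Ap

-- Reference semantics: rows as lists, Ap as a Cartesian product.
runList :: Select a -> [a]
runList (Table rows) = rows
runList (Pure a)     = [a]
runList (Ap f x)     = runList f <*> runList x
runList (Filter p s) = filter p (runList s)

-- Hypothetical row types and made-up sample data.
data Person = Person
  { personName :: String, personAge :: Int, personAddress :: String }

data Birthday = Birthday { bdName :: String, bdDay :: String }

personSelect :: Select Person
personSelect = Table [Person "Tom" 32 "1 Main St", Person "Ann" 27 "2 Side Rd"]

birthdaySelect :: Select Birthday
birthdaySelect = Table [Birthday "Ann" "1998-05-01", Birthday "Bob" "1990-01-01"]

personAndBirthday :: Select (String, Int, String, String)
personAndBirthday =
  (\(p, b) -> (personName p, personAge p, personAddress p, bdDay b)) <$>
    Filter (\(p, b) -> personName p == bdName b)
           ((,) <$> personSelect <*> birthdaySelect)
```

With the sample rows above, `runList personAndBirthday` yields `[("Ann",27,"2 Side Rd","1998-05-01")]`. Each table appears exactly once in the term, so there is no ambiguity about how many times `personSelect` enters the Cartesian product.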

turion · February 6, 2026, 3:05pm

The best answer I know to this question is given in https://homepages.inf.ed.ac.uk/wadler/papers/arrows-and-idioms/arrows-and-idioms.pdf So yes, arrows are more expressive.