Homomorphic static analysis

Applicatives allow performing static analysis, while monads do not. But what does it mean in practice? Why is it so? And what kind of static analysis is this referring to? And what happens if we use categories instead of monads?

These are some of the questions I’m trying to provide an answer to in my latest blog post: Homomorphic static analysis

6 Likes

Enjoyed reading through!

Noticed a minor typo: the latter does not admit any kind of static analysis, while the latter does.

It would be nice to have a clearer link to your previous post; I think it’s a good idea to obviously link together these posts (and Chris’s) since they all work together to give a clearer picture.

Thanks for mentioning the typo. I fixed that and added links to Chris posts

Great post! The last section seems to come very close to Conal Elliott’s Compiling to Categories. In fact, I wonder if we take your record of morphisms and use a type class instead if they would be the same.

yes, the idea is pretty similar. It’s just that the categories you can use to do static analysis are simpler than the ones you would actually use to compile to categories

Hey @marcosh,

great post, thank you. Due to the posts by you and others I have recently looked more into category/arrow based programming. Now I have a question on which I’d like to pick your brain (or anyone elses in the community.)

I really like the concept of Alternative and Parallel like you used them and how you drew the diagram. This strongly suggested to me that strong profunctors (/arrows) are a way to express parallelism. But I hit a problem with that:

I assume your implementation of f &&& g is something like lmap dup (first' f >>> second g').
Here morally first' f and second' g are “parallel”, you drew it like that in your diagram and that makes sense because f and g have no data dependency.

But now assume that we run this in a profunctor which has observable side effects. e.g. a newtype around Kleisli IO where we would want &&& or *** to use concurrently.

Now we have two choices

  1. either we implement *** so that f *** g /= first' f >>> second' g because we implement *** truly parallel while first' f >>> second' g executes f and g sequentially. This is unsatisfying because it a) breaks the intuition implied by your diagrams and b) seems at least dubious for Arrows where *** is defined as first f >>> arr swap >>> first g >>> arr swap and it says “you may implement a more efficient version”. Which is not documented as a law but to me implies that we should have a law that f *** g == first f >>> second g.

  2. we find an implementation of our profunctor which truly fuses first' f >>> second' g and runs f and g concurrently. I tried that and it is possible. However what I couldn’t manage to do was to achieve fusion in a way that first' f >>> second' g /= first' f >>> lmap id (second' g) It seems impossible to me because there is now way to detect whether lmap f introduces a data dependency between f and g. Now that is a problem because Profunctor documents the law lmap id == id.

(I looked into this and my impression is that 2. is only really possible for “Causal Commutative Arrows”, which is quite restrictive.)

So my take away is, that observably parallel arrows cannot do anything interesting when their Profunctor instance is not trivial (like in your static analysis which has a phantom parameter.) So profunctor based programming seems to be fundamentally sequential to me.

What do you think?

1 Like

Yeah, that was basically also the premise of this talk:

However, I don’t understand why you would want this pointfree style and not just use the tried and true approach of (pointful) syntax. You can just define a data type like:

data Program
  = Parallel Program Program
  | Sequential Program Program
  | Var String
  | Lambda String Program
  | App Program Program
  | Print String
  -- ... etc

If you want you can even make this intrinsically typed and extensible.

Is it just because arrows have some built-in syntax? In that case it sounds simpler to me to extend GHC with more overloaded syntax (like Conal Elliott did with his work on compiling to categories).

1 Like

Thank you so much! Will definitely watch that.

Question however: Can you make a bit more explicit what you mean with me aiming for “pointfree” style? My reasoning to try this is that a) profunctors/arrows felt like the reasonable abstraction to do this and b) the existence of arrow syntax, true.

The problem with monads for the purpose of static analysis is (from a certain perspective) that the monadic bind function >>= is not point-free. Its second argument could be a lambda that binds a variable. Switching to (Kleisli) categories means you only allow function composition and identity functions (both pointfree). Strong and Choice are similarly pointfree. The Arrow class allows lifting pointful functions to your category, but they are confined to the arr method (and even this confinement is not enough for all analyses as you write in your comment).

The arrow syntax does find a nice subset of syntax where the compiler can translate pointful programs to the pointfree arrow style, but this introduces a non-trivial translation which in my opinion complicates everything. For example, it is impossible to (exactly) pretty print programs written in arrow notation to their original syntax.

Instead, I think we should just reckon with overloaded variable binding once and for all and avoid the complicating pointfree translation.

1 Like

Well, I agree that pointful Monads are a nice and useful abstraction most of the time.

However I find the idea of static inspectability quite intriguing. For that I know currently 2 abstractions: Arrows and Selective. The problem with Selective is a) no syntax support and b) effects cannot depend on inputs at all, which means you can’t even do putStrLn in Selective.

So, again sorry, I am not sure I can follow you here. (Especially the point “overloaded variable binding”) Are you saying I should just bite the bullet, and give up inspectability so that I don’t need “pointfree” programing? Or are you saying there is a better abstraction to do this?

(I have the inkling that I should just read “compiling to categories” and then maybe I understand what you mean with overloaded variable binding, but not sure …)

There are two concerns:

  1. Finding an abstraction that can capture the semantics that we want to express.
  2. Being able to write it with nice syntax.

For point 1 I think normal ASTs like the Program data type in one of my previous comments fits the bill (perhaps with more type safety but that is a solved problem for simple types). If you want it to be overloadable you can make use the “Tagless-Final” style, but I don’t care much for that.

I think the only reason this is not a standard approach is because of point 2: GHC does not have a nice syntax for it. You can get far with RebindableSyntax, but you run into a wall when you have language constructs which bind variables: lambda, let, and case expressions. Those are all pointful and I assume that is the reason why they are not rebindable yet. With reckoning with variable binding, I mean that we should just choose a representation and make these things rebindable too. I don’t think it is very hard to do that, but I also don’t think there is a nice category theoretic abstraction to capture these variable-binding constructs.

1 Like

Hi!

Nice question! And I need to admit that I don’t have a clear answer.

Just two thoughts:

  1. We wouldn’t have this issue if Strong was defined in terms of *** instead of first' and second' (and similarly for Choice defined in terms of +++).
  2. One potential solution to the issue is to have an intermediate step to a data structure like
data StrongData a b =
  Fun :: (a -> b) -> Fun a b
  Compose :: Fun b c -> Fun a b -> Fun c d
  DiMap :: ...
  First :: Fun a b -> Fun (a, c) (b, c)
  Second :: Fun a b -> Fun (c, a) (c, b)

and every time you have a Compose (First ...) (Second ...) you optimize it to a parallel computation

Nice post. @marcosh where do you think relative monads feature in this?

Yes, that’s exactly what I tried and which I couldn’t figure out. I need to somehow detect that Compose (First ...) (Second ...) is the same as Compose (First ...) (Fun id (Second ...)) unless I want to hurt Arrow/Profunctor laws and I have no idea how to do that, because I can’t do an equality check on a -> a.

1 Like

The overloaded package provides compiler plugins, one of them is for better Arrow desugaring and avoiding the arr combinator.

You could use this, if you’re willing to depend on a GHC compiler plugin. You could have overloaded:categories desugar into your own Operation DSL, which may then implement Eq.

Yes, after the feedback here it seems relatively clear that Arrows are fundamentally sequential and for truly parallel diagrams I am on the look out for other solutions and the most convenient syntax for them. It seems plausible to me that compiling into a bicartesian category like overloaded does is the right approach. Next question is which bullet to bite syntax wise. A plugin makes sense. “Compiling to categories” would be an option. But of course not requiring any of them would be nice: I am still pondering whether it might be possible to use something like the trick in linear-smc to actually use vanilla arrow syntax to write bicartesion categories.

I am not super happy that all bicartesian categories I can find on hackage have type parameters for products and coproducts instead of hardcoding (,) and Either.