R/haskell - Was simplified subsumption worth it for industry Haskell

Copying text from original reddit post:

Warning: Long post

tl;dr

  • simplified subsumption seems to make common code I write in industry clunky for no good reason in lots of places

  • Are others concerned about having to motivate what seem like “pointless lambdas” to new hires or students for simple tasks?

  • Are there more real world advantages that make these frequent annoyances worth it?

  • How is quicklook impredicativity useful in industry?

The biggest advantage seems to be that laziness is more predictable.

However, looking at commits fixing simplified subsumption errors on GitHub, I see that very common patterns in industry Haskell now need an explicit lambda for “reasons” such as:

  readFreqSumFile :: (MonadSafe m) => FilePath -> m (FreqSumHeader, Producer FreqSumEntry m ()) 
- readFreqSumFile file = readFreqSumProd $ withFile file ReadMode PB.fromHandle
+ readFreqSumFile file = readFreqSumProd $ withFile file ReadMode (\h -> PB.fromHandle h)

and:

- toOrders <- asks _pdfConfToOrder 
+ toOrders <- asks (\r -> _pdfConfToOrder r)

And this typical use of id is no longer valid:

instance MonadOrvilleControl IO where
-    liftWithConnection = id
-    liftFinally = id
+    liftWithConnection ioWithConn = ioWithConn
+    liftFinally ioFinally = ioFinally

On my $work codebase that means hundreds of changes that make our code worse with seemingly no benefit.

This case is addressed in the proposal, but it seems to be handwaved away as:

> The benefit, in terms of programming convenience, is small.

From my perspective while updating my codebase, it certainly doesn’t feel that way.

From the perspective of onboarding new Haskell hires, it doesn’t feel simpler. I envision a future teaching session like:

> student: This code looks correct, but I get an error that means nothing to me:

 error:
     • Couldn't match type: b0 -> b0
                      with: forall q. q -> q
       Expected: p -> forall q. q -> q
         Actual: p -> b0 -> b0
     • In the first argument of ‘g’, namely ‘f’
       In the expression: g f
       In an equation for ‘h’: h = g f
   |
   | h = g f
   |     ^

> me: Ah, that’s because of something called simplified subsumption which we’ll cover much later.
> me: For now, just know putting it in an explicit lambda fixes it when you notice a compile error like that.
> me: Now let’s try to move past that and get back to basic file reading and writing.
> student: oookkkay? (feeling unsure, disillusioned about Haskell requiring pointless ceremony and being overly complex for no seeming benefit)
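For reference, here is a minimal program (names chosen to match the error above, but otherwise made up) that produces an error of this shape on GHC 9.0 and later, together with the eta-expanded fix:

{-# LANGUAGE RankNTypes #-}

-- Hypothetical minimal reproducer; 'g' wants an argument whose result type
-- is itself polymorphic (a nested forall).
f :: p -> b -> b
f _ y = y

g :: (p -> forall q. q -> q) -> ()
g _ = ()

h :: ()
h = g f               -- rejected: GHC 9.0+ no longer eta-expands 'f' for us
-- h = g (\x -> f x)  -- accepted: the "pointless lambda" fix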

Being a fan and proponent of Haskell, I think: if this complication is being added, surely something is made possible in return that gives more value.

This led me to the proposal and I found with simplified subsumption:

  • Laziness characteristics and semantics of programs will be changed less, which I infer will lead to more predictable performance
  • I assume that simplifying a compiler step like this will speed up compile times and reduce space usage
  • Quick look impredicativity seems to be the real driving reason behind simplified subsumption and somehow makes dealing with very polymorphic code easier

At this point my thought is:

> Making highly polymorphic code (which isn’t typical in industry Haskell) simpler to write, in ways I can’t determine without great effort, was valued over “small inconveniences” that I’ll run into daily.

But, still wanting to give the benefit of the doubt, I dive face first into the proposal for Quick Look impredicativity.

Reading the whole thing, I still cannot ground this concept in real-world terms that may affect me or that I could take advantage of.

So, I go to the paper A quick look at impredicativity and start reading many things I don’t fully understand.

Running out of energy, I start skimming and finally find some examples in section 10 APPLICATIONS.

I see an example with gZipWithM that I still don’t understand. Further down I see reference to pieces of code updated in Streamly that take advantage of quick look polymorphism and wonder why the real world example wasn’t included and explained.

So, I’m left frustrated with “simplified” subsumption and posting here for help answering:

  • Are others in the same boat?
  • Are there advantages I’m not seeing?
  • Can we use my reflection to improve industry/academic communication?

And finally, any relevant commentary surrounding this I may be oblivious to.

Thanks!

6 Likes

Here is a more detailed analysis and two potential solutions. Unfortunately both solutions have downsides of their own.

Take your first example:

readFreqSumFile file = readFreqSumProd $ withFile file ReadMode (\h -> PB.fromHandle h)

First of all, note the name of the commit: “Naively adapt to simplified subsumption”. That suggests that the authors still want to look for better solutions.

Digging a bit deeper, that fromHandle :: MonadIO m => Handle -> Producer' ByteString m () function depends critically on this definition:

type Producer' b m r = forall x' x. Proxy x' x () b m r 

Such type synonyms with forall quantifiers should be considered an anti-pattern now with simplified subsumption. There was a similar issue reported to the GHC issue tracker: Possible work-around for implicit params and simplified subsumption? (#21168) · Issues · Glasgow Haskell Compiler / GHC · GitLab

Solution 1. Use newtypes

@simonpj suggests making these type synonyms into newtypes:

newtype Producer' b m r = MkProducer' (forall x' x. Proxy x' x () b m r)

If you implement all the required constraints for this type then you can just write the original:

readFreqSumFile file = readFreqSumProd $ withFile file ReadMode PB.fromHandle

But this is not quite a good solution here, because I believe you can’t automatically derive all the instances and you cannot compose these producers with other pipes.

This interoperability could probably be restored by using the same tricks that the optics library uses to get their lenses to compose, but that seems like quite a big change here.
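To make that caveat concrete, here is a self-contained sketch of the wrap/unwrap friction; ProxyStub is only a stand-in for pipes’ Proxy, and all the names are illustrative:

{-# LANGUAGE RankNTypes #-}

-- 'ProxyStub' is a placeholder with the same shape as pipes' Proxy.
data ProxyStub a' a b' b m r = ProxyStub

newtype Producer' b m r = MkProducer' (forall x' x. ProxyStub x' x () b m r)

-- A library function can return the wrapped form without any eta-expansion...
aProducer :: Producer' Char m ()
aProducer = MkProducer' ProxyStub

-- ...but combinators written against the raw ProxyStub force callers to
-- unwrap again at every composition site, so the newtype leaks anyway.
unwrap :: Producer' b m r -> ProxyStub x' x () b m r
unwrap (MkProducer' p) = p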

Solution 2. Rewrite type synonyms in place

Another simple change to resolve this is to just rewrite the type synonyms in-place:

fromHandle :: MonadIO m => Handle -> Proxy x' x () ByteString m ()

Then you can also just write the original non-eta-expanded version. But a problem is that it is harder to understand that the Proxy in this case really must be a producer.

So what then?

Edit: Actually, what I propose here is rife with edge cases; see this comment on the simplified subsumption pull request.

One thing that comes to my mind with increasing frequency is a way to abstract over patterns like that Proxy x' x in a different way. Currently you can do two things: make these variables parameters of the type:

type Producer' x' x b m a = Proxy x' x b m a

or you can use a forall to bind the variables:

type Producer' b m a = forall x' x. Proxy x' x b m a

Sometimes neither is ideal.

What if we could have other abstraction mechanisms, such as a way to say that these variables should always be floated out to the binding the type belongs to? A question remains whether the variable should be shared across all occurrences of this type synonym or not. In this case we always want these variables to be as free as possible, so multiple occurrences of the synonym should always be rewritten like this:

x :: <...> Producer' b1 m1 a1 <...> Producer' b2 m2 a2 <...>

To:

x :: forall x1' x1 x2' x2. <...> Proxy x1' x1 b1 m1 a1 <...> Proxy x2' x2 b2 m2 a2 <...>

Another example of a use case where this would be useful, but with slightly different requirements, is a type synonym for semigroups. In a function with a signature like this:

append :: A -> A -> A

You might want to instantiate A to [a] to get:

append :: forall a. [a] -> [a] -> [a]

As you can see the type variables should be floated out, but this time they should be shared between each occurrence of the synonym A.

This is not something you see every day in Haskell code, but it comes up when writing backpack signatures. For a more complicated real world example where this came up for me, see: Add `primitive` as a boot dependency (#20299) · Issues · Glasgow Haskell Compiler / GHC · GitLab

4 Likes

This seems relevant:

Why is the programmer being asked to do what looks to me like a job for the implementation (eta-expanding partial applications and parameters which are functions)?

2 Likes

In Haskell eta-equivalence simply doesn’t hold (and it never has), at least not in its full generality. Perhaps the ideal solution would be to move towards an implementation that does really support eta-equivalence, but that still requires significant research and the possible negative side effects are unclear:

Change GHC’s intermediate language to base it on System F-eta (Mitchell, 1988). In this language, everything is done modulo eta expansion/contraction. Apart from the huge engineering consequences, it’s not clear that F-eta as an intermediate language is compatible with Haskell, which distinguishes bottom from (\x. bottom).

Alright, the original subsumption algorithm should then be restored because one of the premises for the change mentioned in the proposal:

But actually all we get in exchange is a minor improvement in programming convenience.

…is plainly false. Alternatively, those advocating for this “simplification” can build a tool which can add the eta-expansions to existing code.

As for it reducing implementation complexity: runtime memory management (usually in the form of garbage collection) is also complex to implement - shall we also delegate that task back to the (presumably) all-knowing programmer?

Also from the proposal:

Here is a thought experiment. Suppose GHC lacked all four features, and someone proposed adding them. That proposal would never leave the launchpad.

…by that reasoning, type classes would have never been added to Haskell due to - yes, you guessed it - the problem of orphan instances.

No substantial programming language is ever going to get everything correct from the outset, and Haskell is no exception.

5 Likes

Alright, the original subsumption algorithm should then be restored because one of the premises for the change mentioned in the proposal: […] …is plainly false.

I don’t think there is a process for proposals to automatically get reverted if the assumptions turn out to be false. You can open a new proposal to reintroduce deep subsumption.

As for it reducing implementation complexity: runtime memory management (usually in the form of garbage collection) is also complex to implement - shall we also delegate that task back to the (presumably) all-knowing programmer?

This is unhelpful hyperbole. Removing GC would break 100% of Haskell code. With simplified subsumption, only some of the code that depends on the RankNTypes extension was broken.

…by that reasoning, type classes would have never been added to Haskell due to - yes, you guessed it - the problem of orphan instances.

Type classes are part of the standard. Deep subsumption isn’t as it relies on the RankNTypes extension.

No substantial programming language is ever going to get everything correct from the outset, and Haskell is no exception.

I’m a bit confused by this: are you saying that fixing mistakes of the past, as simplified subsumption was intended to do, is a good thing? If not, what incorrect thing are you referencing?

Thank you… I felt like I was being unreasonable for thinking this.

Plus, almost everyone keeps repeating how small an inconvenience it is, but I don’t quite understand how.

2 Likes
  • for your definition of “some”: clearly the OP disagrees.
  • RankNTypes must surely be one of the oldest (if not the oldest of the) extensions - how long were programmers supposed to wait for it to stabilise (particularly now it’s part of GHC2021)?
  • Maybe by itself this change seems small, but add in the various other adjustments to various other extensions…the initial enthusiasm for Haskell can give way to despair in trying to keep up with all those changes.

Now allow me to clarify:

No substantial programming language is ever going to get everything correct from the outset, and Haskell is no exception.

…when Haskell was a smaller language and had a smaller user base (the “hyenas” fond of “red meat” as described in A History of Haskell), dealing with the adding and removing of new and old extensions was more-or-less accepted: Haskell and its implementations were being used to more intensively explore the possibilities afforded by features like non-strict semantics, typefulness, and the like.

But this research-driven form of development proves troublesome for production purposes (unless the new extensions are only ever added and then left undisturbed - a strategy which requires perfect foresight, and is therefore impossible to implement correctly), and at some point there will be so much “legacy Haskell code” that making changes such as this supposedly-simple subsumption algorithm will cause widespread breakage.

I admit: that was a provocative suggestion. I’m using GC as an example of a more general observation: complexity has to go somewhere - for Haskell, that means it’s either going to be in the language or the implementation:

  • if it’s in the language, an “interesting” variety of ad-hoc site-specific semi-solutions usually appear as a result.
  • if it’s in the implementation, one well-researched general-purpose actual solution is much more likely to be found.

Are the problems with the complexity of deep subsumption inherent to the algorithm, or could it be just a symptom of the larger problems with the default Haskell implementation of choice?

…so were (n+k) patterns and the annoying monomorphism restriction.

2 Likes

Now that I’ve actually understood what code patterns require the expansion – those which make use of type aliases of the form type Foo = forall x. ... – I’m convinced that GHC probably shouldn’t have been allowing these aliases to begin with, and that all the problems arising come from a misfeature that people abused.

I’ve never used aliases of this form, and been mystified by them because GHC’s treatment of them seemed confusing and sort of magical. Reading through the related discussion that now exists, I’m relieved that my confusion wasn’t due to me just “not getting” something obvious, but in fact there was always something not quite coherent going on. The cost to end users of a discipline that forces those aliases to be of the form type Foo x = ... seems very minor compared to the confusion that arises otherwise. The aliases let you pretend you can hide that x from end users – but you can’t really, you can only sometimes make it hard for them to see what’s going on.

I’m glad this band-aid was ripped off, and disappointed that this construction became so widespread in the Haskell world to begin with.
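For concreteness, a small sketch of the two alias styles being contrasted here (the names are made up):

{-# LANGUAGE RankNTypes #-}

-- Alias that hides a forall: it looks like an ordinary type, but it isn't one.
type Hidden = forall x. x -> x

-- broken :: [Hidden]   -- rejected without ImpredicativeTypes: the forall
-- broken = [id]        -- would have to go under the list constructor

-- Alias that keeps the variable visible: behaves like any other type.
type Visible x = x -> x

fine :: [Visible Int]   -- the quantifier stays at the top of the signature
fine = [id, (+ 1) . subtract 1]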

5 Likes

In particular, there are two further conditions:

  • The forall quantifiers of a function and its argument must not match up
  • The argument must be eta-reduced

Also, while the breadth of breakage is large, the fix is technically as easy as eta-expanding a function.

Note that the breakage is not limited to that specific situation. There are also some issues with foralls in the types of record field selectors. And also other less specific cases. Although I think that the type synonym breakage is the least satisfying because the fix breaks encapsulation of the synonym.

2 Likes

That is fortunate:

Alternately, those advocating for this “simplification” can build a tool which can add the eta-expansions to existing code.

…almost thirty years’ worth of existing code: enjoy.


@simonpj suggests making these type synonyms into newtypes:

newtype Producer' b m r = MkProducer' (forall x' x. Proxy x' x () b m r)

Could this be automated to the point where every type synonym is translated into a newtype declaration with an anonymous constructor?

  • it could help to make this breaking change less injurious,
  • and type synonyms could be used more easily in instance declarations.
1 Like

Yes, I had to do

-    [ pure (Arrow.arr . fmap . unArbitraryFunction) ]
+    [ pure (Arrow.arr . fmap . (\x -> unArbitraryFunction x)) ]

where

newtype ArbitraryFunction =
  ArbitraryFunction { unArbitraryFunction :: forall m i b s.
                      Functor m => Choices m i b s -> Choices m i b s }

This surprises me because one of SPJ’s proposed solutions is to wrap in a newtype. But I was doing the good thing all along! I’ve never had this type outside a newtype.

2 Likes

Here’s the explanation: the type of a record field selector is actually misleading. When you write:

newtype ArbitraryFunction =
  ArbitraryFunction { unArbitraryFunction :: forall m i b s.
                      Functor m => Choices m i b s -> Choices m i b s }

The type of the selector is actually:

unArbitraryFunction :: 
  ArbitraryFunction -> 
  (forall m i b s. Functor m => Choices m i b s -> Choices m i b s) 

Notice how the forall quantifier is no longer at the outermost level in this type signature, so this is a higher-rank type, which causes problems with simplified subsumption.

A simple fix for this case is to change the selector into a proper function with a simpler rank 1 type:

newtype ArbitraryFunction =
  ArbitraryFunction (forall m i b s. Functor m => Choices m i b s -> Choices m i b s)

unArbitraryFunction :: 
  forall m i b s. Functor m =>
  ArbitraryFunction -> 
  Choices m i b s ->
  Choices m i b s
unArbitraryFunction (ArbitraryFunction f) x = f x

Now this no longer requires RankNTypes and it works properly with simplified subsumption.

8 Likes

Thanks, that’s a nice explanation. I like that one can write unArbitraryFunction once in eta-expanded form and then use it everywhere without expanding.

I don’t mean to imply that putting these synonyms in a newtype is always a good solution. See my caveats:

But this is not quite a good solution here, because I believe you can’t automatically derive all the instances and you cannot compose these producers with other pipes.

That lack of composability is an enormous problem for most occurrences of these type synonyms. The whole point of pipes is that they compose well.

I think the main takeaway here is that with the new simplified subsumption you cannot abstract over type variables in the same way as with the old deep subsumption. That is a big problem for library writers who used deep subsumption to achieve both encapsulation and interoperability.

Now library authors will have to find other ways to achieve that, which I think is possible, like the optics library does (but it won’t be exactly the same). But that is an invasive change to existing libraries that should have been anticipated when the proposal was under review. Saying you can just eta-expand everywhere is too simplistic, I agree.

1 Like

Are there any examples of such implicit coercions doing the wrong thing in practice? The original proposal did not mention any, as far as I can tell.

I’d be curious to see what it did.

4 Likes

Note that they are not zero-cost coercions, but rather casts similar to the implicit numeric conversions in some other programming languages (but potentially with arbitrarily large overhead).

There are examples in the proposal, but they obviously don’t really occur as such in practice.

I don’t think wrong behavior has ever been reported. And even if it did occur, it might not have been noticed or it might not have been traced back to this deep subsumption. The problem is really very subtle.

One consequence: I think the implicit eta-expansion could have turned some usages of seq into placebos. I’m not sure how common it is to force a function value though. Possibly after pulling the original function out of some data structure that you want garbage collected, or when recursively applying function composition many times, forcing the function value to avoid a space leak?
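To make that subtlety concrete, here is a minimal sketch (hypothetical names) of why seq can observe an eta-expansion that the old deep subsumption inserted silently:

-- seq can tell a bottoming function value apart from its eta-expansion.
boom :: Int -> Int
boom = undefined

forcedDirectly :: ()
forcedDirectly = boom `seq` ()            -- throws: 'boom' itself is bottom

forcedWrapper :: ()
forcedWrapper = (\x -> boom x) `seq` ()   -- (): the lambda is already in WHNF

-- If deep subsumption silently rewrote an argument 'f' into '\x -> f x',
-- a later seq on that value always succeeds immediately, even when the
-- original closure is still an unevaluated, space-hogging thunk.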

1 Like

I already posted a somewhat lengthy comment on the /r/haskell thread about my opinions on this change. But since the discussion here seems a little more focused on the semantic changes, I want to elaborate on those a little further.

I think it’s hard to argue that the semantic changes created by implicit eta-expansion are unacceptable given the existence of the -fpedantic-bottoms flag, which is disabled by default. For the lazy, here is the description of what it does from the link:

Make GHC be more precise about its treatment of bottom (but see also -fno-state-hack). In particular, stop GHC eta-expanding through a case expression, which is good for performance, but bad if you are using seq on partial applications.

So the optimizer sometimes implicitly eta-expands by default anyway, and I’ve never seen anyone actually use -fpedantic-bottoms in practice, so clearly this is not a problem for real Haskell programmers.
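For anyone who has not run into it, here is a sketch of what “eta-expanding through a case expression” means (a made-up function; whether GHC actually performs the expansion depends on optimization settings):

f :: [Int] -> Int -> Int
f xs = case xs of
  []      -> error "empty"
  (y : _) -> \n -> y + n

-- With -fno-pedantic-bottoms (the default) GHC may give f arity 2, roughly
--   f = \xs n -> case xs of { [] -> error "empty"; (y : _) -> y + n }
-- after which  f [] `seq` ()  evaluates to () instead of throwing, because
-- 'f []' is now a partial application and already in WHNF.
-- -fpedantic-bottoms tells GHC not to perform that eta-expansion.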

5 Likes

That is very interesting, I didn’t know about that. I guess instead of “unacceptable”, I would say that changing the semantics is to be avoided if the cost of avoiding it is not too high.

In the case of simplified subsumption, I now also think that the solution of eta-expanding everything is not as trivial as it was made out to be in the proposal. The fact that this cost must in large part be paid by the users of libraries, and not just the library authors, makes it especially hard to justify. So perhaps the costs were too high after all. At least for that line of argumentation.

Nevertheless, I think higher-rank types were always a bit broken in Haskell due to the lack of impredicativity. Higher-rank types were never really first class. For example, you could make a type synonym:

type A = forall a. a -> a

This synonym is a leaky abstraction. If you have a type A, then I’d say everyone expects that they will also be able to write [A]. The lack of impredicativity means that this intuition can be broken by any type synonym: you’d have to look at its definition to be able to tell how you can use it. So impredicative types might be worth the cost of simplified subsumption.
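As a sketch of what that buys us (assuming GHC 9.2 or later, where ImpredicativeTypes is implemented by Quick Look), the intuition about [A] does hold once impredicativity is available:

{-# LANGUAGE ImpredicativeTypes #-}   -- Quick Look, GHC >= 9.2

type A = forall a. a -> a

-- [A] is a legal type, and ordinary list functions can be applied to it.
ids :: [A]
ids = [id, \x -> x]

applied :: Int
applied = head ids 3        -- Quick Look instantiates 'head' at A

reversed :: [A]
reversed = reverse ids      -- and 'reverse' likewise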

Ideally, we would like to have both powerful subsumption and impredicative types, which, from reading the Quick Look paper, seems to be possible in a system called MLF. It is remarkably powerful and robust; see for example this code from section 4.2:

xs1 = []          :: ∀α. [α]
xs2 = const : xs1 :: [∀α β. α → β → α]
xs3 = min : xs2   :: [∀α (Ord α). α → α → α]
xs4 = (<) : xs3   :: [Bool → Bool → Bool]

(I believe all those type signatures can be inferred)

The big problem is that MLF requires major changes to GHC’s existing type checker and it is not even clear that all of GHC’s features are compatible with MLF. So, is anybody interested in writing a proof of concept type checker for a subset of Haskell based on MLF?