MILU and force in PureScript warning logger

Forking off a thread to continue the comment at Deepseq versus "make invalid laziness unrepresentable" - #3 by rhendric


Before responding to @rhendric’s comment I think it will help if I clarify my point of view, to avoid talking at cross purposes. The hill that I am willing to die on is that the correct way to fix the program in Optimiser performance problems is to make invalid laziness unrepresentable (MILU), and that force/deepseq and bang patterns are incorrect. I also extend this claim to a wide variety of similar programs. However, I do not claim that

  1. force/deepseq has no uses
  2. Bang patterns have no uses
  3. MILU is the way that all space leaks should be fixed (although I’d like to collect instructive examples where it is not)

The reason that I believe it’s important to provide a very simple characterization of the solution to a wide variety of space leaks is that “laziness causes difficulty reasoning about time and space usage of Haskell programs” is a very common refrain from Haskell detractors in online discussions, and dispelling this myth is one of the most important hurdles to overcome if we want Haskell to be adopted more widely.


Now on to the comment in question. Thanks, @rhendric, for the detailed reply! I agree that it is very difficult to make invalid states unrepresentable, in general. In fact I wouldn’t be surprised if it was even non-computable. I’m less convinced that the same is true of making invalid laziness unrepresentable. It’s a far simpler problem space. However, I am aware of some sophisticated data structures that seem to make use of laziness in an essential way, and have constraints upon how that laziness should appear at run time that are hard to enforce statically. Examples would be Okasaki’s; I don’t understand them well enough to say more.

I’d like to collect examples where MILU is not, even in principle, the correct solution. Is the Purescript warning logger such a case? Maybe! Let’s see.

I’d like to challenge this assumption, firstly because if there are good reasons then I’d value being enlightened about them, and secondly because it seems to be implying “assume we don’t want to MILU”, which defeats the point of the discussion. Perhaps the reason is that with a lazy expression tree, some traversals can naturally “fuse” into one overall traversal, improving performance. If that’s so it would be good to come up with a case where we can demonstrate the performance benefit of laziness.
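As a starting point, here is a toy sketch of the shape such a demonstration might take (an invented tree type, nothing to do with the real compiler): with lazy constructors, a whole-tree transformation composed with a short-circuiting consumer only evaluates as much of the transformed tree as the consumer demands.

```haskell
-- Toy lazy tree; hypothetical, not PureScript's Expr.
data Tree = Leaf Int | Node Tree Tree

-- A whole-tree transformation. With lazy constructors the Node
-- thunks are only expanded when a consumer demands them.
relabel :: Tree -> Tree
relabel (Leaf n)   = Leaf (n + 1)
relabel (Node l r) = Node (relabel l) (relabel r)

toList :: Tree -> [Int]
toList (Leaf n)   = [n]
toList (Node l r) = toList l ++ toList r

-- 'elem' short-circuits, so the unvisited part of 'relabel t' is
-- never constructed. With fully strict constructors, 'relabel'
-- would rebuild the entire tree before the search began.
containsAfterRelabel :: Int -> Tree -> Bool
containsAfterRelabel x t = x `elem` toList (relabel t)
```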

To fix the problem of bigger-than-necessary memory usage, it seems like there are a few options:

  1. Make MultipleErrors an instance of NFData and then force it in a carefully chosen place.
  2. Make tell and all the fields of everything in MultipleErrors and its dependencies strict.
  3. Make a parallel hierarchy of ...Strict types, and convert to them instead of using force

The current approach is 1.
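Concretely, 1 amounts to something like the following sketch (with hypothetical stand-ins for the real MultipleErrors and its accumulation machinery):

```haskell
{-# LANGUAGE DeriveGeneric, DeriveAnyClass #-}

import Control.DeepSeq (NFData, force)
import GHC.Generics (Generic)

-- Stand-ins for the real types; only the shape matters here.
data ErrorMessage = ErrorMessage String
  deriving (Generic, NFData)

newtype MultipleErrors = MultipleErrors [ErrorMessage]
  deriving (Generic, NFData)

-- The one carefully chosen place: fully evaluate a module's
-- accumulated warnings so that no thunks retaining the module's
-- ASTs survive past this point.
finishModuleWarnings :: MultipleErrors -> MultipleErrors
finishModuleWarnings = force
```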

As a brief aside, without 1, the example in question seems to cause a linear increase in memory consumption. Personally, I am primarily interested in cases where laziness causes space leaks of asymptotically more memory than expected. At least, I prefer to spend my efforts on those first, since they cause the most problems. In fact I’m doubtful whether we should include cases where the increase in space usage is only linear under the terminology “space leak”. Typical definitions, such as

“A space leak occurs when there exists a point in the computer program where it uses more memory than necessary”

are broad enough to include such linear increases, but also broad enough to include many other things that I don’t think we would actually want to include.

Anyway, the terminology part isn’t too interesting, but I think it’s worth pointing out that the Purescript warnings logger example only seems to consume linearly more memory.

To return from the aside, I would say that a single, well-placed force in the codebase doesn’t seem particularly risky. In fact I think it’s easy to understand exactly how it works. I would be more concerned if there were many uses of force in a wide variety of places in the codebase, and it was unclear which of them were redundantly doing work that had already been done previously.

If you really must do that kind of thing then @ekmett has a cool trick called Once, which combines deepseq with MILU. It indicates in the type system that a second, redundant deepseq won’t actually do any work. That may be as inconvenient in practice as option 3, though.
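If I’ve understood the trick correctly, its essence is something like this (my own reconstruction with illustrative names, not @ekmett’s actual code):

```haskell
import Control.DeepSeq (NFData, force, rnf)

-- The constructor is kept private; 'once' is the only way to build
-- a value. The strict field plus 'force' mean that whenever a 'Once'
-- is in WHNF, its payload is in normal form.
data Once a = Once !a

once :: NFData a => a -> Once a
once a = Once (force a)

getOnce :: Once a -> a
getOnce (Once a) = a

-- A second deepseq is therefore O(1): matching on the constructor
-- forces the (already fully evaluated) payload but never traverses it.
instance NFData (Once a) where
  rnf (Once _) = ()
```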

I would look more carefully at why 2 is not possible. If there really is some essential benefit to keeping Expr lazy then I think that would be a very interesting discovery to write up! If there isn’t, then I’d just make it all strict. However, if I was short of time I’d just accept the status quo. One well-placed force in the codebase doesn’t seem like an emergency.


Agreed. But consider this hypothetical [scenario]: if Haskell were parallel by default, and that included a “measured dose” of speculative parallelism of some form, would (so-called) “invalid laziness” even be (that big of) an issue?

The aside first: the expected memory use of the compiler, oversimplifying greatly (such as by assuming that module size and complexity are constant), is O(k), where k is the number of threads used. Without force, the memory use is O(n), where n is the number of modules being compiled that emit leaky warnings. This is asymptotically more than expected (O(n) vs. O(k), which is constant for a fixed thread count), and the constant factors are in the gigabyte range, so even if it’s ‘only’ linear the result is that (per the original report) the compiler wants 30 GB when we expect it to want 2 GB. I don’t know what you’d want to call that if not a memory leak, but I consider it severe enough to warrant my attention.

I don’t know why it would imply that! I’m suggesting that some laziness is valid in parts of my program, and from that you conclude that I don’t want to make invalid laziness unrepresentable? I think I might not understand your thesis as well as I thought.

Making invalid state unrepresentable is compelling because you don’t have to give up any of your valid states. If someone coming from a MISU perspective tried to argue that a state I want to handle oughtn’t to be considered valid, I don’t think disagreeing is tantamount to saying that I don’t want to MISU!

But it sounds like maybe you believe that any laziness, if it isn’t justified by an Okasaki-style amortized cost analysis, should be considered invalid by default? Which somewhat contradicts

‘Laziness doesn’t cause problems; using laziness causes problems’ strikes me as a very fine hair to split, which makes me think I’ve misunderstood you. Telling a new Haskell adopter not to make anything lazy unless they already know the reason they need it is functionally identical to telling adopters not to use laziness, because of course they aren’t going to know why they might benefit from it—they’re new!

If this were a cheap (< 1 day of work) experiment to try, I’d do it and report back. But Expr and Type and all of the types they depend on contain a bunch of lists that would need to be converted to strict vectors, maps that would need to be used strictly, pairs and Maybes and Eithers that would need to be swapped for strict counterparts out of one of the packages that do this (and, inevitably, need to get missing instances and missing API functions reimplemented), and that’s too much work for me to do just to get the answer to this question.
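For a sense of the scale, here is a toy before/after of the kind of conversion involved (invented types; the real Expr and Type are far larger):

```haskell
import qualified Data.Strict.Maybe as S  -- from the 'strict' package

-- A bang on an ordinary list field only forces its outermost
-- constructor, so lists need replacing wholesale with something
-- spine- and element-strict:
data StrictList a = Nil | Cons !a !(StrictList a)

type Ann = String  -- stand-in for a real annotation type

-- Before: every field, list, and Maybe can hide a thunk.
data ExprLazy
  = VarLazy String
  | AppLazy ExprLazy [ExprLazy]
  | LetLazy (Maybe Ann) ExprLazy

-- After: bang every field and swap each lazy container for a strict
-- counterpart, recursively, through every type the AST depends on.
data ExprStrict
  = VarStrict !String
  | AppStrict !ExprStrict !(StrictList ExprStrict)
  | LetStrict !(S.Maybe Ann) !ExprStrict
```

Multiplying that by every constructor of Expr and Type, plus the missing instances and API functions, is what pushes it past a cheap experiment.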

Unsatisfying conclusion, I know. But that’s why I think it’s important to tell Haskell beginners about what deepseq can do for them. Because real-world Haskell programming will often involve years-old projects with hundreds of modules that can’t all be converted to strictness on a hunch, and ‘make it work’ comes before ‘make it right’ and ‘make it fast’.


Sorry, I miscommunicated here. I should have said

I’d like to challenge this assumption, firstly because if there are good reasons then I’d value being enlightened about them, and secondly because if there aren’t good reasons then it seems to be implying “assume we don’t want to MILU”, which defeats the point of the discussion.

Does that resolve the remaining issues related to that sentence?

This particular issue is quite subtle. To reiterate and clarify further, my main goal in promoting MILU is to establish a shared understanding of a characterization of programs which largely eliminates the performance and memory-usage downsides of laziness whilst largely preserving the compositionality benefits. In particular I want to dispel Haskell’s reputation for being hard to reason about in terms of performance and memory usage because it is a lazy language. Here are some examples from Algolia’s Hacker News search:

I often hear about performance/memory usage pitfalls with Haskell laziness

https://news.ycombinator.com/item?id=1456979

purely functional design … makes it hard to reason about performance (especially memory, and most especially if you throw laziness in the mix).

https://news.ycombinator.com/item?id=38419647

the drawbacks [of laziness] are significant, performance, memory usage, and debugging can all be a real pain

https://news.ycombinator.com/item?id=35803884

In Haskell, laziness … can cause … nondeterministic performance, memory usage, GC pressure. “Space leaks” can also occur …

https://news.ycombinator.com/item?id=29192708

I believe this reputation significantly reduces the chance of curious people trying out Haskell, and improving this reputation would have massive benefits for the community as a whole.

Now, regarding the part of my comment that you quoted above: reputedly, the performance and memory usage of Haskell code is hard to reason about because of laziness per se. I am trying to challenge this by responding “no, it’s because of inappropriate laziness that you didn’t want and weren’t benefitting from anyway”.

So I don’t think I’m trying to split a hair, but I am trying to thread a needle. On one side are non-Haskellers who think that being lazy by default makes Haskell essentially unusable. On the other side are (a subset of) Haskellers who think that challenging anything to do with laziness is blasphemy (I’m not referring to anyone engaged in recent discussions here). I’m trying to take a middle path by suggesting that the whole discussion can be finessed by appropriate design of data types.

Yes, fair enough.

On this specific point, I developed strict-wrapper: Lightweight strict types to make exactly this kind of thing easier. By its nature there shouldn’t be any missing API functions, because the whole point is lightweight conversion to the lazy version. (I know for a fact there are missing instances though. I should add them.)
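The core idea is roughly the following simplified sketch (not the package’s exact API; see its documentation for the real interface):

```haskell
-- Simplified illustration of the strict-wrapper idea: a strict
-- mirror of a lazy type plus cheap conversions both ways, so the
-- lazy type's API never needs reimplementing.
data StrictMaybe a = StrictNothing | StrictJust !a

strictMaybe :: Maybe a -> StrictMaybe a
strictMaybe Nothing  = StrictNothing
strictMaybe (Just a) = StrictJust a

unstrictMaybe :: StrictMaybe a -> Maybe a
unstrictMaybe StrictNothing  = Nothing
unstrictMaybe (StrictJust a) = Just a

-- Any "missing" operation is recovered by round-tripping through
-- the lazy type:
fmapStrict :: (a -> b) -> StrictMaybe a -> StrictMaybe b
fmapStrict f = strictMaybe . fmap f . unstrictMaybe
```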

There’s a lot left unspecified there. Who are these beginners, how are they learning, what do they have to delay learning if they learn about deepseq instead (if anything)? It’s very hard to make a precise claim without knowing much more context. Instead, here’s a precise situation that I hope is easier to debate:

When Optimiser performance problems - #10 by emiruz was posted, it would have been most helpful to the poster, and to the community as a whole, if someone had replied

Yes, Data.Vector can store thunks. Use Data.Strict.Vector instead.

and the discussion about space leaks and laziness had ended there.

I claim yes, because it’s a simple solution that extends to the vast majority of situations that new users will encounter and it avoids overwhelming them with “an eye-watering number of methods” that will just confuse and disappoint them (and, I believe, lead to attrition from Haskell).
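To make the suggested reply concrete, here is roughly what the leak and the fix look like (a schematic reconstruction, not the original poster’s code; I use an unboxed vector here, though Data.Strict.Vector works equally well for boxed elements):

```haskell
import qualified Data.Vector as V
import qualified Data.Vector.Unboxed as U

-- Boxed vectors are element-lazy: each V.map fills the new vector
-- with thunks referencing the old elements, so after n steps every
-- element retains a chain of n closures.
leaky :: Int -> V.Vector Double
leaky n = iterate (V.map (* 0.99)) (V.replicate 1000 1.0) !! n

-- An unboxed vector cannot store thunks, so every element is
-- evaluated as each step is built and the chains never form.
notLeaky :: Int -> U.Vector Double
notLeaky n = iterate (U.map (* 0.99)) (U.replicate 1000 1.0) !! n
```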

Regarding the Purescript situation, using force seems better than doing nothing, and also seems better than an invasive investigation if there aren’t the resources to perform that investigation. This suggests that it’s good for Haskellers to learn about and use force sometimes. Where we draw the line is open for debate. I made the claim above to help the debate.


…by dispensing with laziness: “because laziness is a problem here, let’s just remove it”. So how do you prevent that from being generalised to:

  • because laziness is a problem […], let’s just remove it.

Data-Driven Ecosystem Migration: Non-Intrusive Migration of R Ecosystem from Lazy to Strict Semantics (2023)