Strict, StrictData, UNPACK

Forking off a thread from @vshabanov’s comment in One Billion Row challenge in Hs - #223 by vshabanov


I don’t feel too strongly. Less experienced users might be better off with -XStrictData, but then there’s also the risk that they never learn to use !.

Yes, and this is explained further in the User’s Guide. But firstly, I don’t think there’s much risk of UNPACK happening by accident, and secondly unboxing is not that closely related to laziness. There are morally two steps between something unboxed and something lazy: 1. unboxing, 2. lifting.
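
To make the two steps concrete, here is a minimal sketch (the type names are illustrative):

```haskell
data Plain    = Plain Int                     -- lazy field: boxed, lifted, may hold a thunk
data Strict   = Strict !Int                   -- strict field: always evaluated, but still a boxed Int
data Unpacked = Unpacked {-# UNPACK #-} !Int  -- field stored inline as a raw Int#, no box at all
```

Roughly: going from Unpacked to Strict re-adds the box (step 1), and going from Strict to Plain re-adds the possibility of a thunk (step 2).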

I don’t understand this. I expect making the fields of data types strict to be better for performance in the vast majority of cases. There are exceptions where you can avoid a redundant calculation by returning a value in a lazy field, but I expect such cases not to arise much in performance-critical code anyway.
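
As an illustration of that exception (all names here are hypothetical): a field that is rarely demanded can be cheaper to leave lazy.

```haskell
-- Hypothetical example: `context` is expensive to build and rarely inspected.
data Parsed = Parsed
  { value   :: !Int    -- always used, so strict
  , context :: String  -- diagnostic text: a thunk until (and unless) demanded
  }

parse :: String -> Parsed
parse s = Parsed (length s) (render s)
  where
    render = unwords . replicate 1000  -- stand-in for an expensive rendering step
```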

Do you have some examples you can share of strict fields making performance worse (where the worsening cannot be attributed to redundant computation)?

Who are they? For the record, I am not one. I think lazy function arguments are wonderful. My current expectation, though, is that strict fields of data types are the correct default choice in the vast majority of cases.

Despite valuing lazy function arguments, I haven’t been able to understand Ed’s insistence that what he wants can’t be achieved in a strict language with explicit laziness. Okasaki’s book was written in such a language, after all.

  • From page 26 of 33 in How to Declare an Imperative (1997)

  • From More points for lazy evaluation (2011)

I regularly have to dig into a large Haskell/Mu codebase. Seeing a data declaration and not knowing that it’s strict can be quite misleading. In general, having a LANGUAGE pragma somewhere invisibly affecting the code is not good for maintenance (but good for quick experimentation).

It explains that you need -O. But reboxing can happen even with -O2, since not every function can be inlined. I’ve seen increases in allocations due to UNPACK.
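
A minimal sketch of how reboxing can arise (consume is a hypothetical stand-in for any function GHC can’t inline):

```haskell
data P = P {-# UNPACK #-} !Int

{-# NOINLINE consume #-}  -- stands in for any function the optimiser can't see through
consume :: Int -> Int
consume n = n + 1

-- The field is stored as a raw Int#, but `consume` wants a boxed Int,
-- so every call must allocate a fresh box: here UNPACK costs an
-- allocation instead of saving one.
rebox :: P -> Int
rebox (P n) = consume n
```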

I added it because it looks like a similar silver bullet that makes code more performant, but it doesn’t always help.

How much of the code is performance critical? I would say that for an average webapp, or maybe even a compiler, the majority of the code is not performance critical. And I would argue that redundant computation is the norm rather than the exception.

No (except when paired with -funbox-strict-fields); I’m mostly pointing out that redundant computations are pretty frequent.

Another pitfall is using non-strict data in strict fields: field :: !(Maybe foo) won’t help much, because the bang evaluates only to WHNF. One needs to use strict data all the way down, which is not that convenient.
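
A minimal sketch of the pitfall (Foo and mkRec are hypothetical):

```haskell
data Foo = Foo Int  -- note the lazy field inside

data Rec = Rec { field :: !(Maybe Foo) }

-- The bang forces `field` only to WHNF, i.e. to the Just/Nothing
-- constructor; the Foo inside the Just can still hide a large thunk.
mkRec :: Int -> Rec
mkRec n = Rec (Just (Foo (sum [0 .. n])))  -- `sum [0 .. n]` stays unevaluated
```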

I’ve seen several messages about strictness and performance in the original thread, and warned that it’s not as simple as making everything strict.

I think there is a single rule of thumb: “is it an accumulator? make it strict”.
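
For example, a classic strict accumulator, as a minimal sketch:

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

-- Without the bangs, foldl' forces only the pair constructor,
-- and chains of (+) thunks build up inside it.
mean :: [Double] -> Double
mean xs = total / fromIntegral count
  where
    (total, count) = foldl' step (0, 0 :: Int) xs
    step (!t, !c) x = (t + x, c + 1)
```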

Other cases are more nuanced. If it’s an aggregation (the result is smaller than the source data, AND we don’t want to keep the source data, AND we are fine with always doing the aggregation), then either force evaluation or maybe make it strict.
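
A sketch of the “make it strict” option for an aggregation (Stats and aggregate are hypothetical): strict fields make the work happen as the result is built, so no thunk retains the source list.

```haskell
import Data.List (foldl')

data Stats = Stats { total :: !Double, count :: !Int }

-- Strict fields force the arithmetic as each Stats is constructed,
-- so the (possibly huge) source list is not retained by thunks.
aggregate :: [Double] -> Stats
aggregate = foldl' (\(Stats t c) x -> Stats (t + x) (c + 1)) (Stats 0 0)
```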

If the data is created once and not updated recursively, the need for StrictData is weaker, and the gain from not evaluating unused fields is greater.

YMMV: if you are dealing with numeric code, or have a codebase that is prone to space leaks for some reason, then you may need to enforce StrictData (and use StrictMaybe, StrictEither, etc.). But this can make ergonomics worse.
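
For reference, the strict variants amount to little more than a bang on the payload; this sketch is in the spirit of the types mentioned above (similar ones live in the strict package):

```haskell
data StrictMaybe a    = SNothing | SJust !a
data StrictEither a b = SLeft !a | SRight !b
```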

There are alternative approaches: spine-strict data; deepseq (an inefficient “traverse it twice” hammer, but it can work when there’s no time to find a leak); or a seq at the point where a big thunk is created, just before it is put into a lazy field. Lots of them.
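
A sketch of the last option (Cache and mkCache are hypothetical; force comes from Control.DeepSeq): the value is evaluated fully before it goes into the field, so consumers still see a lazy field, but it never hides a big thunk.

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.DeepSeq (force)

data Cache = Cache { payload :: [Int] }  -- deliberately lazy field

-- Fully evaluate `xs` before storing it: the bang forces `force xs`,
-- which deep-evaluates the list.
mkCache :: [Int] -> Cache
mkCache xs = let !ys = force xs in Cache ys
```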

Good point. I have been programming in OCaml for several years. It’s strict and has lazy values, and I would say that programming in a lazy-by-default language feels much more pleasant (not sure how a lazy language with strict data would feel, though).

Ed’s take includes the modularity part as well. The performance part can be reproduced in a strict language (OCaml is faster than Haskell in many cases), but modularity (with performance) is more challenging to achieve.
