What is the state of the art on the interaction of laziness and parallelism? I don’t see much (if any) talk about it.
Haskell has an exceptional concurrency model that is facilitated by purity, not laziness.
Laziness keeps you pure. But does laziness keep you parallel? I believe laziness keeps you from automating parallelization. Therefore, to achieve an Automatic Parallel Haskell, you need to ditch Lazy Haskell (which basically means designing another language).
It kinda makes sense. Laziness demands that you defer as much as possible (you don’t want to do needless work). Parallelism demands that you be as eager as possible (you accept the risk of needless work so that results are ready when you need them). If you could automatically determine that something should run in parallel, laziness is out of the picture: either the result will be needed or the spark will fizzle out, so there is nothing left to defer.
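To make that concrete, here’s a minimal sketch with `par` and `pseq` from the parallel package (`expensive` is just a made-up placeholder; compile with -threaded and run with +RTS -N):

```haskell
import Control.Parallel (par, pseq)

-- Hypothetical stand-in for any costly pure function.
expensive :: Int -> Int
expensive n = sum [1 .. n * 100000]

-- `par` sparks `x` eagerly for speculative parallel evaluation. If the
-- consumer never demands `x`, the spark fizzles and the work is discarded.
-- Either way, speculation has replaced deferral: there is no thunk to keep.
speculate :: Int -> Int -> Int
speculate a b =
  let x = expensive a
      y = expensive b
   in x `par` (y `pseq` (x + y))  -- spark x, evaluate y, then combine

main :: IO ()
main = print (speculate 2 3)
```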
Laziness doesn’t just interfere with the parallelization of finite expressions; the infinite data structures it allows pose a challenge too. If you have an infinite list of expensive items, and the user will consume some unknown number of them, when do you start generating them? When do you stop? How do you reconcile minimizing time spent with minimizing memory consumed, staying as eager as possible without exploding memory with both ordinary thunks and speculatively sparked ones?
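The parallel library’s partial answer here is to make the speculation window explicit. A sketch with `parBuffer`, which only sparks a rolling buffer of elements ahead of the consumer, bounding both wasted work and memory (`expensive` is again a placeholder):

```haskell
import Control.Parallel.Strategies (parBuffer, rseq, using)

-- Hypothetical costly per-item work.
expensive :: Int -> Int
expensive n = sum [1 .. n + 1000000]

main :: IO ()
main = do
  -- An infinite list of costly items; `parBuffer 50` keeps roughly 50
  -- sparks in flight ahead of wherever the consumer has reached.
  let results = map expensive [1 ..] `using` parBuffer 50 rseq
  print (sum (take 200 results))
```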
I think Haskell does alright with the automatic parallelization it currently has (using strictness analysis to eagerly evaluate computations it knows will be needed), and it enables libraries that do the parallelization for you (Accelerate, massiv, async, ki). Basically, you get the best of both worlds: you don’t have to declare the order of evaluation and can reap the benefits of laziness, but when you want parallelism you can declare it explicitly, and those declarations compose.
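For instance, here’s roughly what the “library does it for you” experience looks like with async’s `mapConcurrently` (the `fetch` function and its inputs are hypothetical stand-ins for any IO work):

```haskell
import Control.Concurrent.Async (mapConcurrently)

-- Hypothetical worker; imagine an HTTP fetch or a file parse here.
fetch :: String -> IO Int
fetch url = pure (length url)

main :: IO ()
main = do
  let urls = ["a.example", "b.example", "c.example"]  -- made-up inputs
  -- One concurrent task per element; results come back in input order.
  lengths <- mapConcurrently fetch urls
  print lengths
```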
Maybe it’s a good thing that parallelization has to be made explicit. With HVM3, for example, you can write both sequential and divide-and-conquer algorithms, and you won’t notice the parallelization kicking in until you run them and compare the results. In a way, it’s similar to writing lazy code and not realizing the thunk leaks are there until you run it and look at the memory graph.
By providing the programmer with as many shapes as possible to manually fit the problem into (Strategies, as the parallel library calls them), you are indeed explicitly helping the programmer write parallel code. This reminds me of Clojure and its recur special form. Clojure doesn’t do Tail-Call Optimization, but it gives you recur, which turns the enclosing function into a loop. If you write a function with recur in the wrong shape (a non-tail position), the compiler rejects it and tells you why; write it in the right shape and it compiles and won’t blow the stack.
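Back on the Haskell side, here’s a hedged sketch of what two of those “shapes” look like: the same pure map, fitted with two different Strategies after the fact (`expensive` is a placeholder):

```haskell
import Control.Parallel.Strategies (parList, parListChunk, rdeepseq, using)

-- Hypothetical costly per-element work.
expensive :: Int -> Int
expensive n = sum [1 .. n * 10000]

main :: IO ()
main = do
  let xs = [1 .. 1000] :: [Int]
      -- One spark per element: fine-grained shape.
      fine   = map expensive xs `using` parList rdeepseq
      -- One spark per 100-element chunk: coarser shape, less spark overhead.
      coarse = map expensive xs `using` parListChunk 100 rdeepseq
  print (sum fine == sum coarse)
```

The pure `map expensive xs` never changes; only the shape bolted onto it does, which is what makes the annotations composable.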
So if automating parallelism in a language crafted for that very purpose is a tall order (because you need to make the shapes match, otherwise everything degenerates into a sequence), laziness is a wrench thrown into the works with dynamite strapped on top: it would demand an impossible-to-build runtime doing so much profiling it would make V8 and HotSpot blush.
Basically, I don’t see an Automatic Parallel Haskell coming, but Parallel Haskell is indeed here and it’s great (which alone makes Haskell worth recommending for commercial use, though I didn’t want to derail the thread).