Is "Parallel Haskell" (the first part of Simon Marlow's "Parallel and Concurrent Haskell" 2013 book) still a thing?

Parallel Haskell: the Eval monad, the Par monad, Repa, Accelerate (GPU programming).

Are these still “a thing”? Should they still be used (i.e., are they well maintained, etc.)?

I ask because these days I mostly see the second part’s material (MVar, TMVar, and all those reusable goodies) used all over the place, including to achieve parallelism.
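For example, the kind of thing I mean (just a minimal sketch of my own, not taken from any particular codebase) is fanning work out over forkIO and collecting the results back through MVars:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)

-- Run f over each element on its own thread; an MVar per element
-- carries the (forced-to-WHNF) result back to the caller.
parMapIO :: (a -> b) -> [a] -> IO [b]
parMapIO f xs = do
  vars <- mapM spawn xs
  mapM takeMVar vars
  where
    spawn x = do
      v <- newEmptyMVar
      _ <- forkIO (evaluate (f x) >>= putMVar v)
      pure v
```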

One barely sees the first part’s material mentioned anywhere, so I wondered whether it had fallen into oblivion for some reason.

3 Likes

The “parallelism” in the book is specifically compute parallelism. It might be due to my background (web services and the like), but I find that compute parallelism is a bit less generally useful in industry.

I would guess that there are people relying on & benefiting from the parallelism features. I’d love to hear what for!

2 Likes

Repa is not very active; I’d suggest using massiv instead.
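For example, something along these lines (going from memory, so double-check against the current massiv docs) gives you parallel construction and reduction just by choosing the Par computation strategy:

```haskell
import Data.Massiv.Array as A

main :: IO ()
main = do
  -- Par tells massiv to use all capabilities, both for building the
  -- array and for the map/sum reduction below.
  let arr = makeArrayR P Par (Sz1 10000000) fromIntegral :: Array P Ix1 Double
  print (A.sum (A.map (\x -> x * x) arr))
```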

The Accelerate repository is active, but the latest Hackage release is quite outdated. At the very least it’s worth being acquainted with the ideas behind Accelerate; they apply to other Haskell approaches to GPU programming as well.
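For flavour, the classic dot-product example from the Accelerate documentation looks roughly like this; the CPU backend comes from accelerate-llvm-native (swap in the PTX backend for GPUs), and given how the Hackage releases lag, check versions before relying on the exact API:

```haskell
import Data.Array.Accelerate as A
import Data.Array.Accelerate.LLVM.Native as CPU  -- or the .PTX backend for GPUs

-- The whole computation is an embedded (Acc) program that the backend
-- compiles and runs in one go.
dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = A.fold (+) 0 (A.zipWith (*) xs ys)

main :: IO ()
main = do
  let xs = fromList (Z :. 10) [0 ..]    :: Vector Float
      ys = fromList (Z :. 10) [1, 1 ..] :: Vector Float
  print (CPU.run (dotp (use xs) (use ys)))
```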

Long-time lurker, first-time poster. I’ve been working for a while on a repo using parallelism in computer algebra at https://github.com/DaveBarton/calculi. I’ve been able to get speedups of around 30x on the Buchberger algorithm for Gröbner bases, surprisingly easily using Haskell, while as far as I know no one else has successfully parallelized this algorithm at all. (There are other parallel algorithms for Gröbner bases, based on “F4” and “F5”, which I plan to work on next.) This is definitely of interest in the computer algebra community, and I’ve been invited to a workshop in late September where I’ll tell some of them about it. I had been planning to post here anyway to ask you all for advice and suggestions.

Others have suggested that parallelism might be the killer app (or a first killer app) for Haskell, which I personally agree with. In addition to our personal machines having 5-10+ performance cores in the next few years, researchers often have access to clusters of machines with 50-100 cores per NUMA node, and soon there will be more. I know some other Haskellers also have access to such machines. See the graph in this link, which actually shows the number of AMD and Intel performance cores per NUMA node. ARM is coming on strong also, and CPUs with 100-200 cores have been announced for 2024, costing around $10K per CPU in bulk. Of course you have to buy a whole system, and pay for electricity and maintenance, but it’s still easily affordable for research labs, universities, etc. There’s open source and proprietary software available for job submission and scheduling, etc.

Of course, floating-point supercomputers using GPUs etc. have now crossed the “exascale” milestone, which is 10^18 FLOPS: a billion billion floating-point operations per second! But this only works because of massive SIMD-ness, e.g. 50,000+ GPUs each containing thousands of hardware multipliers. You have to mostly avoid even branch instructions to get this to work, so obviously it’s not for general-purpose programming. (I am not a real expert on this, so if someone wants to correct me or explain it better, please do.)

I believe in functional programming, pure code, Hindley-Milner based type checking, etc., so that Haskell code can be safer, more modular, and more general than in other languages. But parallelism may be the way to convert the masses. Especially for shared-memory multiprocessing (as opposed to distributed or “grid” computing), Haskell really makes things easy.

Getting back to the original post, I have learned a lot about this. A quick summary: for obviously (“embarrassingly”) parallel algorithms, I do at least use rpar and runEval from Control.Parallel.Strategies. But I think that book’s approach, and GHC sparks in general, try too hard to maintain determinism; to get competitive times for Gröbner bases, I had to fork my own threads, which was still pretty easy. I learned other things too, such as that on 30+ cores you need to keep laziness away from your shared data structures, and that GC currently becomes a bottleneck. I can/will say more about this, but this post seems long enough for now.
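To make the Strategies part concrete, the embarrassingly parallel pieces look roughly like this (the names are placeholders, not code from my repo):

```haskell
import Control.DeepSeq (NFData, force)
import Control.Parallel.Strategies (parMap, rdeepseq, rpar, runEval)

-- Deterministic, spark-based: score each chunk in parallel and fully
-- force the results, so no laziness leaks back into shared structures.
scoreAll :: NFData b => (a -> b) -> [a] -> [b]
scoreAll scoreChunk chunks = parMap rdeepseq scoreChunk chunks

-- The same idea spelled out with rpar/runEval for just two pieces of work.
both :: NFData b => (a -> b) -> a -> a -> (b, b)
both f x y = runEval $ do
  fx <- rpar (force (f x))
  fy <- rpar (force (f y))
  pure (fx, fy)
```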

Finally, I’d like to mention and ask about the history of this. Apparently, Microsoft funded a Parallel GHC project in 2010-2012. Unfortunately, the funding ended, Simon Marlow went to Facebook, and work on extending the parallel garbage collector to use local heaps stopped. But for CPU-intensive programming, I still believe parallelism is the biggest problem in computer science, and that Haskell is the best way to solve it. I’d really like to hear from others with experience in this area. (Sorry for the long note.)

13 Likes

rpar is still a great way to squeeze a little more performance if you know what you’re doing.

I second massiv for bulk processing. You can even use its scheduler for your own scatter-gather tasks.
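Something like this, going from memory (the exact Scheduler type has shifted between versions of the scheduler package, so check Control.Scheduler’s docs):

```haskell
import Control.Scheduler (Comp (Par), scheduleWork, withScheduler)

-- Submit a batch of IO jobs to a thread pool sized by Par (all
-- capabilities) and gather their results.
scatterGather :: [IO a] -> IO [a]
scatterGather jobs = withScheduler Par $ \scheduler ->
  mapM_ (scheduleWork scheduler) jobs
```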

Finally, there’s vulkan for GPU compute pipelines, although this needs more user-friendly wrapping.

5 Likes

It depends what you mean by “still a thing”, but they all still work, and they all get used. I used parallel only a few weeks ago to parallelise some OpenCV calls. Specifically, we needed to do some computer vision on a few specific regions of an image, so I slapped a parTraversable around that and my job was pretty much done!
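Roughly like this, with detect standing in for the real per-region OpenCV call (and assuming that call is pure, or wrapped to look pure):

```haskell
import Control.DeepSeq (NFData)
import Control.Parallel.Strategies (parTraversable, rdeepseq, using)

-- Each region's result is sparked off and fully forced in parallel.
detectAll :: (Traversable t, NFData b) => (a -> b) -> t a -> t b
detectAll detect regions = fmap detect regions `using` parTraversable rdeepseq
```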

5 Likes