I’ve been trying to make my rendering code a bit faster with wide (SIMD) ops, but everything I try ends up slower than the dumbest straightforward variant. E.g. pointwise (+) for Vec4 ByteArray# is a tiny bit faster with FFI calls, but loses by a third (!) to Vec4 Float Float Float Float and then using regula…

How to cook with SIMD?
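
For reference, the "dumbest straightforward variant" mentioned above might look roughly like the sketch below. This is an assumption about the shape of the code, not the poster's actual definition: the strict fields, the `UNPACK` and `INLINE` pragmas, and the name `addVec4` are all hypothetical.

```haskell
module Main (main) where

-- Hypothetical baseline: four strict Float fields.
-- The UNPACK pragmas are an assumption; the original post only
-- says "Vec4 Float Float Float Float".
data Vec4 = Vec4
  {-# UNPACK #-} !Float
  {-# UNPACK #-} !Float
  {-# UNPACK #-} !Float
  {-# UNPACK #-} !Float
  deriving (Eq, Show)

-- Pointwise addition, one scalar (+) per component.
addVec4 :: Vec4 -> Vec4 -> Vec4
addVec4 (Vec4 a b c d) (Vec4 e f g h) =
  Vec4 (a + e) (b + f) (c + g) (d + h)
{-# INLINE addVec4 #-}

main :: IO ()
main = print (addVec4 (Vec4 1 2 3 4) (Vec4 10 20 30 40))
```

A definition like this gives GHC plain unboxed floats to work with once it is inlined, which may be part of why it is hard to beat with a `ByteArray#`-backed representation.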

atravers August 25, 2024, 9:09pm 2

From a web search for "haskell ghc simd poor performance", this:

seems to be the most recent of the first six “vaguely-relevant” results - does anything there help?
