What speed-up to expect for parallel program on 6 cores?

rubenmoor · January 20, 2022, 3:26pm

So it looks like I made a mistake regarding the flag “+RTS -A64M”, which sets the allocation area size of the GC. I mistakenly concluded that, in general, -A64M results in a shorter runtime. Now I can’t reproduce this anymore, neither for parallel code nor for sequential code (I did quite a bit of refactoring in the meantime, but none of my changes provides an obvious explanation).

At some point I tweaked the -A parameter to that value and left it there, in the false belief that it helped.

The runtime statistics (+RTS -s) indicate “99.0% productivity” with -A64M vs. a mere “74.5% productivity” without setting -A. There is considerably less GC activity in the former case: 0.9s elapsed time vs. 9.4s in the latter case. Still, the overall runtime is halved when I omit -A64M.
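For anyone who wants to repeat this comparison from inside the program, here is a minimal sketch using `GHC.Stats` from `base`. It is not my actual code: `workload` is a hypothetical stand-in for the real computation, and `productivity` is only an approximation of the figure that `+RTS -s` prints, since it ignores RTS init and exit time. It assumes the binary is started with `+RTS -T` so that statistics are collected.

```haskell
import Data.Int (Int64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)
import Text.Printf (printf)

-- | Approximate productivity in percent: mutator time over mutator + GC time.
-- Assumption: init/exit time is ignored, so this can differ slightly from
-- the "Productivity" line of +RTS -s.
productivity :: Int64 -> Int64 -> Double
productivity mutatorNs gcNs
  | total <= 0 = 0
  | otherwise = 100 * fromIntegral mutatorNs / fromIntegral total
  where
    total = mutatorNs + gcNs

-- | Convert nanoseconds to seconds.
nsToSeconds :: Int64 -> Double
nsToSeconds ns = fromIntegral ns / 1e9

-- | Hypothetical allocation-heavy stand-in for the real computation.
workload :: Int -> Int
workload n = length (show (product [1 .. toInteger n]))

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS statistics are off; run with +RTS -T"
    else do
      print (workload 5000)
      stats <- getRTSStats
      let mutNs = mutator_elapsed_ns stats
          gcNs = gc_elapsed_ns stats
      printf "GC elapsed:      %.3f s\n" (nsToSeconds gcNs)
      printf "Mutator elapsed: %.3f s\n" (nsToSeconds mutNs)
      printf "Productivity:    %.1f %%\n" (productivity mutNs gcNs)
```

Running the same binary once with `+RTS -T` and once with `+RTS -T -A64M` prints the GC share next to the mutator time for each setting, which makes it easier to see whether a lower GC time actually translates into a shorter total runtime.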

With the -A flag simply omitted, the speed-up looks far more favorable. It remains odd to me that the OS-process approach consistently outperforms the GHC threads, but the threads, at least, show consistent speed-up all the way to 11 jobs.
