What speed-up to expect for parallel program on 6 cores?

rubenmoor · January 21, 2022, 2:11pm

So the issue with -A64m seems relatively clear to me. The default value of -A is 1 megabyte (-A1m), and I checked the cache sizes of my CPU, an AMD Ryzen 5 3600:

Cache L1:	64K (per core)
Cache L2:	512K (per core)
Cache L3:	32MB (shared)

I guess the default in the GHC settings is geared toward common hardware. If I increase the allocation area size to much more than 4 MB, main memory will be used instead of cache. The allocation area size is per job (one allocation area per capability, as set by -N), i.e. “+RTS -N10 -A4m” implies 40 MB in total.
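The arithmetic above can be sketched as a tiny Haskell helper. The function names are made up for illustration, and comparing the summed allocation areas against the L3 size is only a rough rule of thumb on my part, not something the RTS itself checks.

```haskell
module Main where

-- | Total allocation area in MB: each capability (-N) gets its own
-- allocation area of the size given by -A.
totalNurseryMB :: Int -> Int -> Int
totalNurseryMB capabilities perCapabilityMB = capabilities * perCapabilityMB

-- | Rough heuristic (an assumption, not an RTS rule): do all allocation
-- areas together fit into a shared L3 cache of the given size?
fitsInL3 :: Int -> Int -> Int -> Bool
fitsInL3 l3MB capabilities perCapabilityMB =
  totalNurseryMB capabilities perCapabilityMB <= l3MB

main :: IO ()
main = do
  -- The example from the post: +RTS -N10 -A4m
  print (totalNurseryMB 10 4)   -- 40
  -- 32 MB shared L3, 6 capabilities, -A4m: 24 MB in total
  print (fitsInL3 32 6 4)       -- True
  -- Same machine with -A64m: 384 MB in total
  print (fitsInL3 32 6 64)      -- False
```

With 6 capabilities, -A4m stays below the 32 MB of L3, while -A64m exceeds it by more than a factor of ten.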

At around -A64m, garbage collection efficiency is highest, but it all takes place in main memory. With a small allocation area that still fits in cache, GC efficiency is relatively low, but this is apparently counterbalanced by much faster memory access.
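To compare different -A settings, one option is to read the RTS statistics from inside the program. The following is a minimal sketch using `GHC.Stats` from base; the workload is a placeholder and `gcShare` is a hypothetical helper, not part of any library. The program has to be run with `+RTS -T` for the statistics to be collected.

```haskell
module Main where

import Data.Int (Int64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- | Fraction of elapsed time spent in GC, given GC and mutator
-- elapsed times in nanoseconds. Hypothetical helper for illustration.
gcShare :: Int64 -> Int64 -> Double
gcShare gcNs mutNs
  | total <= 0 = 0
  | otherwise  = fromIntegral gcNs / fromIntegral total
  where
    total = gcNs + mutNs

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "Run with +RTS -T to enable RTS statistics."
    else do
      -- Placeholder workload that allocates; replace with the real program.
      print (sum [1 .. 5000000 :: Integer])
      stats <- getRTSStats
      putStrLn ("collections:       " ++ show (gcs stats))
      putStrLn ("major collections: " ++ show (major_gcs stats))
      putStrLn ("GC share of time:  "
                ++ show (gcShare (gc_elapsed_ns stats) (mutator_elapsed_ns stats)))
```

Running the same binary with, say, `+RTS -T -N6 -A4m` and `+RTS -T -N6 -A64m` and comparing the GC share against total wall-clock time would show whether fewer collections actually translate into a faster run.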

The same behavior can be observed on my laptop with an Intel CPU.

I will post the minimal example here shortly, so others can quickly reproduce the issue.
