GHC's `-j[<n>]` flag, useful enough to be a 'default'?

There is a 2018 issue in Stack’s repository (#4046) that discusses the merits of stack build making use of GHC’s -j[<n>] flag, perhaps ‘by default’. That has prompted me to look at the flag, and I have some questions on which I would appreciate any help:

  1. If GHC’s -j[<n>] flag were an unambiguously good thing for faster compilation, would it not be enabled by GHC by default (it is not)? Does that mean use of the flag can have a downside?

  2. In practice, related to point 1, do people who are conscious of the existence of GHC’s -j flag tend to set it as a matter of course or not?

  3. Am I correct to understand that modern versions of Cabal (the tool) do not set GHC’s -j flag by default? Related to point 1, if not, why not?

  4. -jN first appears in the documentation for GHC 7.8.1 (although it is not identified as something new). GHC 8.2.1 is the first to document “If N is omitted, then it defaults to the number of processors.”, but that is not identified as something new. Does anybody recall if that was a change in behaviour or simply an improvement in the documentation? (Stack aims to support back to GHC 7.10.3, at present, so I am interested in backward compatibility.)
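For context, this is how the flag can be passed through Stack today (a sketch; the value 4 is an arbitrary example, and `$locals` is Stack’s target key for all project packages):

```yaml
# stack.yaml — pass GHC's module-level -j flag to every local package.
ghc-options:
  "$locals": -j4
```

The same effect for a single build can be had on the command line with `stack build --ghc-options=-j4`.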


One downside that I’ve seen in practice, both in Haskell and elsewhere, is that using -j requires more RAM, so on some systems and workloads this can result in an OOM.


I use -j a lot, but higher is not always better. It controls how many modules are built in parallel: this is intra-package, module-level parallelism (where the dependencies permit). -j4 is my default (it’s the default in nix as well, by the way). Cabal’s -j flag controls package-level parallelism (again, where the dependency graph permits).

This means passing -j4 --ghc-options=-j4 to cabal can lead to 16 modules being compiled at the same time. Matthew Pickering recently wrote about the addition of -jsem, which tries to control the quadratic explosion from combining the -j flags of the compiler and the build tool. And I believe there were some other in-depth details on the -j flag; maybe you can find that post here on Discourse.
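To make the multiplication concrete, here is a sketch of the two settings combined in a cabal.project (flag values are arbitrary examples):

```yaml
# cabal.project — a sketch. `jobs` is cabal-install's package-level
# parallelism; the -j in ghc-options is GHC's module-level parallelism.
jobs: 4
package *
  ghc-options: -j4
# Worst case: 4 packages building at once, each compiling 4 modules,
# gives 16 concurrent module compilations.
```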


Here’s the jsem doc: jsem: parallelism semaphores for GHC — ghc-proposals documentation


Another potential issue is the use of the multithreaded RTS - it still has some “rough edges”:


It is my opinion that all tools should default to -j1.

I have expressed that opinion before:

In the past I’ve worked on projects where the dependency graph happened to have a couple of very memory-straining packages like pandoc and amazonka. Building these in parallel is a disaster.

There’s no proper way with GHC to guess the memory usage during compilation. Number of cores is an irrelevant metric.


For GHC 9.8 and later, -jsem as @angerman mentioned should hopefully be a better strategy than separately guessing how many cores to use at both the inter-package and per-package level. It is being implemented in cabal-install (Add `--semaphore` flag to enable interaction with GHC Job Server… by mpickering · Pull Request #8557 · haskell/cabal · GitHub) and it would be great if Stack could support it too. Perhaps it shouldn’t be the default right away though, so we get some experience with it in practice first.
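For reference, the cabal-install side of this (per the PR linked above) is exposed as a command-line flag; a hedged sketch, assuming a cabal-install and GHC (9.8+) with -jsem support:

```sh
# Sketch: let cabal-install and the GHC processes it spawns share one
# job semaphore, so the total number of concurrent compile jobs is
# capped at 8 overall, rather than 8 packages x N modules each.
cabal build -j8 --semaphore
```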

That does still leave the problem of how to decide whether the build is going to be memory-bound (and hence the number of parallel jobs may need to be restricted). Perhaps there could be a simple “parallel build” configuration flag that can be switched on or off, then regardless of which option is the default, users can be advised to configure it according to their needs.
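One simple version of such a heuristic (purely my sketch, not anything Stack or Cabal implements): cap the job count by estimated memory per job as well as by core count. The per-job memory estimate is the hard part and is assumed to be supplied, e.g. by a user setting or a previous build.

```haskell
-- A sketch of a memory-aware job-count heuristic. All names here are
-- hypothetical.
jobsFor
  :: Int  -- available cores
  -> Int  -- available RAM, in GiB
  -> Int  -- estimated peak RAM per compile job, in GiB
  -> Int
jobsFor cores memGiB perJobGiB =
  max 1 (min cores (memGiB `div` max 1 perJobGiB))

main :: IO ()
main = do
  print (jobsFor 8 16 4)  -- memory-bound: 4 jobs
  print (jobsFor 8 64 4)  -- core-bound: 8 jobs
  print (jobsFor 8 2 4)   -- never below 1 job
```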


I would like that when using my battlestation… but I would also like to avoid it when working on a laptop, for thermal reasons.

Yes, and because you can’t really decide that at all, the default should be -j1. Relying on the OOM killer is playing Russian roulette with user data (which can get corrupted).


and because you can’t really decide that at all, the default should be -j1.

Perhaps we could keep track of memory usage in earlier builds to better structure the concurrency of following ones. Like a form of profile-guided optimization, but for build plans.


Is there some way to track which jobs have failed due to memory exhaustion during each part of the build? If so, those failed jobs could be rescheduled to run “one-by-one” after the other jobs in that part (the ones which did succeed without any intervention).

I would support Stack supporting GHC’s -jsem initiative, but I suspect Stack doing so (or, at least, doing so quickly) is beyond my own capability - even though the GHC proposal anticipates that the required changes to Stack would be ‘small and non-invasive’. I’ll raise an issue on Stack’s repository (#6131) and if anyone reads it (or this discussion) and is up for the challenge, their contribution to Stack’s development would be welcome.


I see that the proposition in 2018 – that the default for Stack’s jobs configuration option, and the default for Cabal (the tool)'s use of Cabal’s jobs configuration option (which otherwise defaults to 1), should prioritise ‘safe’ over ‘fast’ – fell on stony ground. In each case, the reasoning of those against the proposition seems to have been that the perceived demand for ‘fast’ was so great that it trumped the fact that ‘fast’ could sometimes be ‘unsafe’ (even on machines with 8 cores and 16 GB of RAM). I won’t revisit that debate for Stack, but I will improve its online documentation about the possible risks of the current default.

This sounds truly terrible. Unless you use earlyoom, you very often won’t even get a clean OOM kill; instead the kernel starts thrashing swap, possibly freezing the machine for 15-30 minutes.

Source: that’s what happened to me regularly with GHC on large projects.


…more “terrible” than:

…?

In the absence of GHC getting a great deal faster, users who can’t afford workstations (let alone “battlestations”!) will more and more turn to their only other convenient option and (try to) run more jobs simultaneously. So if even that is failing to work “regularly with GHC on large projects”…then this must surely now be the only hope left for GHC, at least in its current form:

@hasufell, if what you are saying is correct - that not even -j2 can be relied upon to work properly in this era of multi-core/thread machines - it would most definitely help to explain why Haskell is still a “minority-user” language: because everyone else got tired of waiting…and waiting…and waiting…and went elsewhere to find a language they could actually use (in much the same way GHC devs eventually got tired of waiting for darcs and decided to switch to a more popular (but stupid) content manager which was faster to use).

So @hasufell…is GHC really that horrendous to use, even for someone as experienced as yourself?


I don’t really understand your point.

As GHCup developer and contributor to a lot of other Haskell tooling I think I have somewhat of a picture of the average Haskell newbie.

Without trying to be insulting: I get bug reports from users who wonder why their installation has been stuck for 30 minutes (they didn’t press ENTER, although it says to do so on the screen).

Given that I’ve had recurring issues with cabal/stack parallelism in a professional setting, I am speechless that we want to expose new users to more of these potential issues.

GHC memory consumption during compilation is unpredictable, and there have been cases where we had to split an autogenerated module, because no machine could compile a 7k-LOC module of types:

Waiting for builds to crash and then somehow adjusting the build plan seems to me like an absolute embarrassment if I were to show Haskell to a new industrial user.

Also note that these days people are running HLS, which already has high memory consumption.

-jsem seems cool for power users, who are familiar with the memory footprint of their builds. But it doesn’t solve the fundamental problem.


I don’t really understand your point.

What if the OP of Stack issue #4046 was a new user at the time, who made the post out of frustration at having to manually adjust -j options themselves to improve build performance, and who then wondered why Stack didn’t use -jN by default?


Given that I’ve had recurring issues with cabal/stack parallelism in a professional setting, I am speechless that we want to expose new users to more of these potential issues.

They could also be exposing themselves to said issues, again out of frustration, this time with their builds going slower than glaciers.


Waiting for builds to crash and then somehow adjusting the build plan seems to me like an absolute embarrassment if I were to show Haskell to a new industrial user.

Hrm:

Provided it’s a “build supervisor” process that is changing or reinterpreting the build plan, the prospective industrial user shouldn’t be all that concerned. What would probably draw more unwanted attention these days are the various problems about GHC’s difficulties with using commodity multi-thread/core hardware, particularly for employees who are new to Haskell…


[…] it doesn’t solve the fundamental problem.

And if the advice to new users is that -j1 is the only safe way to build anything practical with Haskell, on machines which can obviously support more jobs…isn’t that also an “absolute embarrassment”?

In the absence of something like e.g. -mR where:

  • R is the fraction of memory GHC is using (e.g. 50% or 0.5);
  • if R is exceeded, no new jobs are started by GHC,

…then just passing in -j by itself will probably result in you having to deal with issues about Stack builds crashing (as noted by @hasufell). As for -jsem…if the new jobserver is aware of the memory being used by all jobs and can react accordingly, then that would be a viable long-term option.
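Since -mR is purely hypothetical, here is a sketch of the intended admission check (all names are mine, not GHC’s):

```haskell
-- Hypothetical admission check for a memory-fraction limit (-mR):
-- a new job may start only while GHC's share of system memory is
-- still below the configured fraction.
canStartJob
  :: Double  -- limit R, e.g. 0.5 for 50%
  -> Double  -- fraction of system memory GHC currently uses
  -> Bool
canStartJob limitR usedFraction = usedFraction < limitR

main :: IO ()
main = do
  print (canStartJob 0.5 0.30)  -- below the limit: start another job
  print (canStartJob 0.5 0.55)  -- over the limit: hold new jobs
```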

In the short term (and as dreary as it probably is to most), -j2 would seem to be the safest option:

  • an appreciable improvement in build throughput can be expected;

  • if both running jobs use enough memory to activate the local OOM mechanism, it’s less likely to cause the problems @hasufell describes (e.g. presumably the build will stop more quickly);

  • if memory-intensive jobs are infrequent, then the build can still keep processing other, “smaller” jobs in the meantime;

  • there are probably still a (very) few 2-core or 2-thread machines out there being used for (smaller) Haskell builds;

  • with a change like this, it’s probably best to be conservative to begin with (e.g. noting the aforementioned “quadratic explosion” problem), allowing more later…much later, if needed.

TL;DR: Use -j2 with caution.

Many thanks for the various replies, which have helped me.