Speeding Up Compiling

Anyone have success in dramatically speeding up GHC builds?

I work on a remote team and am wondering if there is a way to have a server help with compile times.

Has anyone tried using a server to build the dependencies with cabal and share the results over NFS? The remote hosts could then mount the NFS export read-only and use unionfs-fuse so that they can make local changes if needed.

This isn’t the ideal solution.

Any thoughts?

1 Like

I haven’t tried this, but in principle, using Bazel with a remote (cloud) cache allows people to share compilation artifacts, so that if a member of the team has already compiled something, Bazel will download that artifact rather than have you compile it.

Bazel supports Haskell, but as I understand it, this is a rather large hammer to wield.

1 Like

Maybe cabal-cache (a CI assistant for Haskell projects) is the sort of thing you’d want to use as part of that.

But ultimately, if your compile times are slow, the thing to do is figure out why and address the problem. Most slow compile-time issues can be tracked down and avoided. The main culprit tends to be Generics.

2 Likes

It’s waiting 15 minutes for all the deps to build.

Also, using a custom mtl doesn’t help

but the deps should only get built once for the project and not again, unless they change?

3 Likes

(I’m going to sound like a fuddy-duddy, I know, but I’m getting rather tired of people claiming something is slow without giving any timings, sizings, or hardware details whatsoever. The OP here is in effect asking ‘how slow is a piece of string?’.)

Any build time for a large project that you can measure in minutes is quick. I’d expect a full build to take all night. So (as @sclv says) that should build all the deps; then during your working day it needs only an incremental build. Which’ll give you time to go get a coffee/sandwich/sharpen your coding pencils.

Focusing on Generics is a micro optimization.

Dependency graphs and (lack of) boundaries are what really bite you at the 1000s of modules scale from what I’ve seen.

Both Bazel and Nix benefit from proper boundaries for good caching. The sub-package caching of Bazel is helpful, but also more in micro-optimization land IMO; even with that, a neat dependency graph is going to pay off.

Ideally, no package should have to get that large, but that takes discipline and care to do. If you let it run wild without valuing boundaries and interfaces, it’s a lot harder to fix after the fact.
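
For example, a purely hypothetical multi-package layout along those lines, where each boundary becomes a separately cached, separately rebuilt unit for cabal (and a natural caching unit for Nix or Bazel):

-- cabal.project (hypothetical layout)
-- each package rebuilds only when something inside its boundary changes
packages:
  ./core-types
  ./domain
  ./web-api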

It’s easier to do the dishes as you cook than deal with a full sink :)

2 Likes

Oh one practical bit of advice:

--repl-no-load is a great default for larger projects. Then you don’t have to build everything. Just what you need.
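
For example (the module path here is made up, and this assumes a cabal-install new enough to support the flag):

$ cabal repl --repl-no-load
ghci> :load src/MyApp/Feature.hs

GHCi then loads just that module and whatever it imports, instead of every module in the component.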

2 Likes

Maybe a little bit outdated, but I think these are still good resources:
https://www.parsonsmatt.org/2019/11/27/keeping_compilation_fast.html
http://rybczak.net/2016/03/26/how-to-reduce-compilation-times-of-haskell-projects/
Hardware aspect:

For bigger binaries, where linking takes more time, I would also consider moving the .stack or .cabal directories to a RAM disk, though I have no solid proof it will help.

2 Likes

Have you tried compiling with -O0 for dev builds?

If performance of the resulting app is an issue, try -O1 -fomit-interface-pragmas, which optimises modules locally but does not try to inline/specialise/optimise across module boundaries. This should also help with recompilation times (kudos to Matthew Pickering for this tip). It is likely that it also helps with Generics compile-time perf (at the expense of hideous runtime perf).
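
As a concrete sketch, one place to put those flags during development is a cabal.project.local stanza (the package name here is hypothetical):

-- cabal.project.local: development-only flags, typically not checked in
package my-app
  ghc-options: -O1 -fomit-interface-pragmas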

Whether build artefact caching is useful depends on your use case.
Is this about initial compile times for a fresh checkout? Then build artefact caching might be worthwhile.
Is this about recompilation times when developing a new feature in your fork of mtl (say)? Then build artefact caching isn’t all that useful. After all, you will need to rebuild all importers of mtl. -fomit-interface-pragmas should mean you don’t need to rebuild any indirect importers, though, which should make a huge difference.

4 Likes

Module/package structure is definitely also important to optimize, but that doesn’t diminish the importance of getting individual module compile times down.

Both are important and whether one or the other is the limiting factor will vary from project to project and over time. If you have a long linear chain of modules, you can get huge benefits from breaking up the chain and benefiting from parallelism and fewer rebuilds. If you have one very slow to rebuild module that often gets rebuilt, then focusing on that is helpful.

That’s fair, and stuff like Generics, TH, fancy type level stuff, etc can definitely be expensive.

But this is really where a nice graph comes in! Take Generics, for example: it is used for types. How often does a type change? Not as often as code. So isolate the types in their own modules. Now they get cached and only rebuilt when changed, and each type module is built in parallel, so even a full rebuild isn’t as bad.
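
A minimal sketch of that kind of separation (module and type names are made up):

{-# LANGUAGE DeriveGeneric #-}
-- MyApp/Types.hs: only type definitions and their derived instances live here,
-- so this module is rebuilt only when a type actually changes.
module MyApp.Types where

import GHC.Generics (Generic)

data User = User
  { userName :: String
  , userAge  :: Int
  } deriving (Show, Generic)

Downstream modules (JSON instances, encoders, and so on) import MyApp.Types, so the type definitions stay cached and only the modules that actually changed get rebuilt.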

Usually the issue with the fancy stuff is using a bunch of it in one big file. Large modules are generally slower for GHC to compile than many small ones.

So if you follow this, you no longer have to program defensively against compile times. Which is kinda silly, right? Generics is super cool, useful, and elegant, and I’ve seen lots of people cargo-cult the advice to avoid it and ignore it wholesale because of the base, low-level concern of compile time. I don’t like to let my tools’ weaknesses drive my thoughts and design.

4 Likes

I’ll add one more tip: for certain types of generics-heavy code (e.g. Rel8 instances) we’ve discovered that adding the following pragmas to the files that contain it (or passing these as flags for the whole project) can speed things up drastically:

{-# OPTIONS_GHC -fignore-interface-pragmas #-}
{-# OPTIONS_GHC -fomit-interface-pragmas #-}
7 Likes