Problem with GHC's recompilation checker

I am having difficulty with GHC’s recompilation checker; the difficulty has its origins in a Stack issue. I thought I would document it here, in case anyone has any advice or I have got the wrong end of the stick.

I think the lesson is: if you have a project where one local/mutable package depends on another, you need to either:

  1. clear past local build artefacts (e.g. with stack purge); or
  2. turn off the recompilation checker for all local packages, e.g. with Stack project configuration:
ghc-options:
  $locals: -fforce-recomp

Otherwise, the recompilation checker may produce odd results that are difficult to predict.

My test example is in the GitHub repository mpilgrem/ghc-recomp-test (a test of the effect of GHC's recompilation checker). It is a simple two-package project (packageA, packageB). The packageA library exposes module LibA:

module LibA where

message :: IO ()
message = putStrLn "Message #1"

The packageB library depends on packageA and exposes module LibB:

module LibB (message) where
import LibA

packageB also provides executable progB, which depends on the packageB library:

module Main where

import LibB

main :: IO ()
main = message

Stack’s project configuration is straightforward:

resolver: lts-21.4 # GHC 9.4.5
packages:
- packageA
- packageB

Build without optimisation and then run the executable:

> stack build --ghc-options -O0 --exec progB
... build output ...
Message #1

If you then change the source code in LibA.hs to change the message to, say, "Message #2", and do the same again, something unexpected happens (despite Stack asking for all the things affected by the change to be rebuilt):

> stack build --ghc-options -O0 --exec progB
... build output ...
Message #1

This unexpected result ("Message #1") does not occur if you build with Cabal’s default GHC optimisation (-O). You get the expected "Message #2".

The unexpected result arises because, without optimisation, GHC’s LibA.hi interface file does not change when that part of the source code changes, and so the ABI hash of the LibA module is the same. (With optimisation, the content of LibA.hi changes with that part of the source code, and the ABI hash changes.)

If you ask GHC to be verbose and informative (extracts only) …

> stack build --ghc-options "-O0 -v -ddump-hi-diffs" --exec progB
...
packageB> compile: input file src\LibB.hs
packageB> *** Checking old interface for LibB (use -ddump-hi-diffs for more details):
packageB> Considering whether compilation is required for LibB:
...
packageB> Checking interface for module LibA packageA-0.1.0.0-DHArZEtuWXUA5Cw1WZLOMV
packageB> Module fingerprint unchanged
packageB> [1 of 1] Skipping LibB
....
Message #1

… you can see GHC deciding to skip compiling LibB, because the ‘module fingerprint’ of LibA is unchanged. The fingerprint refers to the ABI hash. This can be seen in the corresponding output with optimisation:

> stack build --ghc-options "-v -ddump-hi-diffs" --exec progB
...
packageB> compile: input file src\LibB.hs
packageB> *** Checking old interface for LibB (use -ddump-hi-diffs for more details):
packageB> Considering whether compilation is required for LibB:
...
packageB> Checking interface for module LibA packageA-0.1.0.0-DHArZEtuWXUA5Cw1WZLOMV
packageB>   Module fingerprint has changed 56d9247b6388186e749be6d9dd24e909 -> 585094af0aff42c5ce46c89eb1b01b37
packageB> [1 of 1] Compiling LibB [LibA changed]
...
Message #2

(The DHArZEtuWXUA5Cw1WZLOMV is the ‘installed package ID’ chosen for packageA-0.1.0.0 by Cabal (the library). It also does not change when the source code of LibA.hs changes.)

According to the output of GHC’s --show-iface mode, LibA.hi records that the source code has changed (via a different src_hash). With "Message #1" you have (extracts):

❯ stack build --ghc-options -O0 --exec "ghc --show-iface D:\Users\mike\Code\GitHub\ghc-recomp-test\packageA\.stack-work\dist\c3556505\build\LibA.hi"
...
interface LibA 9045
  interface hash: 6270e3f4dda3678774feb1579a397a8e
  ABI hash: 22af1d8c76150daffaa01a4c79158904
...
  src_hash: e6301e8ba0fd9165e1c30aee5f55d4ac
...

and with "Message #2" you have (extracts):

❯ stack build --ghc-options -O0 --exec "ghc --show-iface D:\Users\mike\Code\GitHub\ghc-recomp-test\packageA\.stack-work\dist\c3556505\build\LibA.hi"
...
interface LibA 9045
  interface hash: 11cce6383dd95d775e45377c8d52fc40
  ABI hash: 22af1d8c76150daffaa01a4c79158904
...
  src_hash: c7b2150949b29027b7581ddca94687a0
...
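
As a rough illustration of the two extracts above, here is a minimal Haskell sketch. It is a toy model and not GHC's actual code: the Iface type and the functions sourceChanged and importerNeedsRecompile are hypothetical names, and the rule "an importer is recompiled only when the ABI hash of an imported module changes" is a simplifying assumption based on the 'Module fingerprint unchanged' output shown earlier. The hash values are the ones reported by --show-iface.

```haskell
module Main where

-- Hypothetical, simplified view of the two hashes of interest in LibA.hi.
data Iface = Iface
  { abiHash :: String
  , srcHash :: String
  } deriving (Eq, Show)

-- Values copied from the --show-iface extracts (built with -O0).
libAMessage1, libAMessage2 :: Iface
libAMessage1 = Iface
  { abiHash = "22af1d8c76150daffaa01a4c79158904"
  , srcHash = "e6301e8ba0fd9165e1c30aee5f55d4ac"
  }
libAMessage2 = Iface
  { abiHash = "22af1d8c76150daffaa01a4c79158904"
  , srcHash = "c7b2150949b29027b7581ddca94687a0"
  }

-- Did the source of the module itself change?
sourceChanged :: Iface -> Iface -> Bool
sourceChanged old new = srcHash old /= srcHash new

-- Assumption: an importing module is recompiled only if the ABI hash
-- it recorded for the imported module differs from the current one.
importerNeedsRecompile :: Iface -> Iface -> Bool
importerNeedsRecompile recorded current = abiHash recorded /= abiHash current

main :: IO ()
main = do
  putStrLn ("LibA source changed:     " ++ show (sourceChanged libAMessage1 libAMessage2))
  putStrLn ("LibB needs recompiling:  " ++ show (importerNeedsRecompile libAMessage1 libAMessage2))
```

Under these assumptions the first line reports True and the second False, which matches LibA being compiled and LibB being skipped.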

I find it a little odd that GHC chooses not to compile LibB.hs when it knows (presumably) that the source code of LibA has changed (even if its ABI hash has not). I can see that not all source code changes may be substantive, but I would have thought it safer to assume that if the source code had changed, something important could be different.


That’s the point of the ABI hash: It hashes exactly all those things that could affect the compilation of LibB. Once you link the generated object files together it should just work™.

I wonder whether GHC’s or Stack’s recompilation logic is to blame here. It would be good to see the exact GHC commands that Stack executes.

Unfortunately, the ABI hash for module LibA does not necessarily hash all the things that could affect the compilation of LibB - that could be said to be the source of the problem described above. The source code of LibA.hs can change in substantive ways, but the ABI hash for the module is unaffected. In the example, the source code changes from message = putStrLn "Message #1" to message = putStrLn "Message #2" without the ABI hash changing.

The above is not a problem if GHC optimisation is invoked (which Cabal (the library) does by default), because then LibA.hi includes (extract only):

...
632aa92c13d59a48dcdb6f2b7cd01fe5
  message3 :: GHC.Prim.Addr#
  [HasNoCafRefs, LambdaFormInfo: LFUnlifted,
   Unfolding: ("Message #1"#)]
...

That is, a little bit of the LibA.hs source code (the relevant bit) also appears in LibA.hi.


Are you sure that can affect the compilation of LibB? I don’t think that is necessarily the case. GHC could simply compile LibB to be an indirection that points to LibA. That indirection could remain valid even if the content of the message changes.


On GHC commands, Stack actually invokes Cabal (the library) and then Cabal invokes GHC. One can get the full story (Stack’s, Cabal’s and GHC’s) with:

stack --verbose build --ghc-options "-O0 -v -ddump-hi-diffs" --cabal-verbosity=deafening --exec progB

(Cabal uses MIDL-style response files and its ‘deafening’ is required to inspect the contents of those files.)

Cabal’s command to GHC 9.4.5 for the making of LibB is as follows (reformatted and annotated with my comments #):

"C:\Users\mike\AppData\Local\Programs\stack\x86_64-windows\ghc-9.4.5\bin\ghc-9.4.5.exe"
"--make" # Using GHC's --make mode
"-fbuilding-cabal-package" # Tell GHC a Cabal package is being built
"-O" # Cabal's default optimisation, overridden later on (see below)
"-outputdir" ".stack-work\dist\c3556505\build" 
"-odir"      ".stack-work\dist\c3556505\build" 
"-hidir"     ".stack-work\dist\c3556505\build" 
"-stubdir"   ".stack-work\dist\c3556505\build" 
"-i" # Clear GHC's search path
"-i.stack-work\dist\c3556505\build" 
"-isrc" 
"-i.stack-work\dist\c3556505\build\autogen" 
"-i.stack-work\dist\c3556505\build\global-autogen" 
"-I.stack-work\dist\c3556505\build\autogen" 
"-I.stack-work\dist\c3556505\build\global-autogen" 
"-I.stack-work\dist\c3556505\build" 
"-IC:\Users\mike\AppData\Local\Programs\stack\x86_64-windows\msys2-20230526\mingw64\include"
"-optP-include" "-optP.stack-work\dist\c3556505\build\autogen\cabal_macros.h" 
"-this-unit-id" "packageB-0.1.0.0-BDlnYQTGe6fLVPLNizclSi" 
"-hide-all-packages" 
"-Wmissing-home-modules" 
"-no-user-package-db" 
"-package-db" "C:\sr\snapshots\a76b5152\pkgdb" 
"-package-db" "D:\Users\mike\Code\GitHub\ghc-recomp-test\.stack-work\install\c4ae365c\pkgdb" 
"-package-db" ".stack-work\dist\c3556505\package.conf.inplace" 
"-package-id" "base-4.17.1.0" 
"-package-id" "packageA-0.1.0.0-DHArZEtuWXUA5Cw1WZLOMV" 
"-XHaskell2010" 
"LibB" 
"-O0" # Reset the optimisation to -O0. This is the only thing that changes between 'working' and 'failing' 
"-v"
"-ddump-hi-diffs" 
"-fhide-source-paths" 
"-fdiagnostics-color=always"

On changing the source code of LibA affecting the compilation of LibB: empirically, if GHC is forced to recompile LibB (by -fforce-recomp), the behaviour changes. Only then is the change from "Message #1" to "Message #2" reflected in the executable’s output. (To be clear, GHC compiles LibA each time its source code changes; I think LibB cannot be only a ‘conduit’. For completeness, when GHC skips compiling LibB, it then also skips compiling Main and skips linking to create the executable progB.)


What about the ghc command to build the executable for Main?

On GHC commands and Main.hs, there are two (reformatted and annotated with my comments #):

First, omitting the linking phase:

"C:\Users\mikep\AppData\Local\Programs\stack\x86_64-windows\ghc-9.4.5\bin\ghc-9.4.5.exe" 
"--make" 
"-no-link" # Omit the linking phase
"-fbuilding-cabal-package" 
"-O" 
"-static" 
"-outputdir" ".stack-work\dist\c3556505\build\progB\progB-tmp" 
"-odir"      ".stack-work\dist\c3556505\build\progB\progB-tmp" 
"-hidir"     ".stack-work\dist\c3556505\build\progB\progB-tmp" 
"-stubdir"   ".stack-work\dist\c3556505\build\progB\progB-tmp" 
"-i" 
"-i.stack-work\dist\c3556505\build\progB\progB-tmp" 
"-iapp" 
"-i.stack-work\dist\c3556505\build\progB\autogen" 
"-i.stack-work\dist\c3556505\build\global-autogen" 
"-I.stack-work\dist\c3556505\build\progB\autogen" 
"-I.stack-work\dist\c3556505\build\global-autogen" 
"-I.stack-work\dist\c3556505\build\progB\progB-tmp" 
"-IC:\Users\mikep\AppData\Local\Programs\stack\x86_64-windows\msys2-20230526\mingw64\include" 
"-optP-include" "-optP.stack-work\dist\c3556505\build\progB\autogen\cabal_macros.h" 
"-hide-all-packages" 
"-Wmissing-home-modules" 
"-no-user-package-db" 
"-package-db" "C:\sr\snapshots\21152fbc\pkgdb" 
"-package-db" "C:\Users\mikep\Documents\Code\GitHub\ghc-recomp-test\.stack-work\install\4dc55535\pkgdb" 
"-package-db" ".stack-work\dist\c3556505\package.conf.inplace" 
"-package-id" "base-4.17.1.0" 
"-package-id" "packageB-0.1.0.0-BDlnYQTGe6fLVPLNizclSi" 
"-XHaskell2010" 
"app\Main.hs" 
"-O0" 
"-v" 
"-ddump-hi-diffs" 
"-fhide-source-paths" 
"-fdiagnostics-color=always"

Second, as above but not omitting the linking phase:

"C:\Users\mikep\AppData\Local\Programs\stack\x86_64-windows\ghc-9.4.5\bin\ghc-9.4.5.exe"
"--make"
"-fbuilding-cabal-package"
"-O"
"-static"
"-outputdir" ".stack-work\dist\c3556505\build\progB\progB-tmp"
"-odir"      ".stack-work\dist\c3556505\build\progB\progB-tmp"
"-hidir"     ".stack-work\dist\c3556505\build\progB\progB-tmp"
"-stubdir"   ".stack-work\dist\c3556505\build\progB\progB-tmp"
"-i"
"-i.stack-work\dist\c3556505\build\progB\progB-tmp"
"-iapp"
"-i.stack-work\dist\c3556505\build\progB\autogen"
"-i.stack-work\dist\c3556505\build\global-autogen"
"-I.stack-work\dist\c3556505\build\progB\autogen"
"-I.stack-work\dist\c3556505\build\global-autogen"
"-I.stack-work\dist\c3556505\build\progB\progB-tmp"
"-IC:\Users\mikep\AppData\Local\Programs\stack\x86_64-windows\msys2-20230526\mingw64\include"
"-optP-include" "-optP.stack-work\dist\c3556505\build\progB\autogen\cabal_macros.h"
"-LC:\Users\mikep\AppData\Local\Programs\stack\x86_64-windows\msys2-20230526\mingw64\lib"
"-LC:\Users\mikep\AppData\Local\Programs\stack\x86_64-windows\msys2-20230526\mingw64\bin"
"-hide-all-packages" 
"-Wmissing-home-modules" 
"-no-user-package-db" 
"-package-db" "C:\sr\snapshots\21152fbc\pkgdb" 
"-package-db" "C:\Users\mikep\Documents\Code\GitHub\ghc-recomp-test\.stack-work\install\4dc55535\pkgdb" 
"-package-db" ".stack-work\dist\c3556505\package.conf.inplace" 
"-package-id" "base-4.17.1.0" 
"-package-id" "packageB-0.1.0.0-BDlnYQTGe6fLVPLNizclSi" 
"-XHaskell2010" 
"app\Main.hs" 
"-o" ".stack-work\dist\c3556505\build\progB\progB.exe" # Specify the executable as output
"-O0" 
"-v" 
"-ddump-hi-diffs" 
"-fhide-source-paths" 
"-fdiagnostics-color=always"

The last bit is the bug: Skipping the linking.

It is OK and correct not to compile the modules LibB and Main (with -O0), because LibA’s .hi file is unchanged and the LibB.o and Main.o files produced would be unchanged. Recompiling LibA should have changed LibA.o, though, with the new message, and then linking them all together should propagate the new behaviour into the .exe. It’s actually pretty neat that you don’t have to rebuild LibB and Main, isn’t it?

What I can’t see right away is why the linking step doesn’t happen when it should.
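
To make the argument concrete, here is a toy Haskell model of object files and linking. It is only a sketch under stated assumptions, not how GHC or the system linker works internally: the types Def and Object, the function link and the symbol names are all hypothetical, and it assumes that, because LibB only re-exports message, Main.o refers to the symbol defined in LibA.o.

```haskell
module Main where

-- A symbol is either defined by literal content or refers to another symbol.
data Def = Literal String | RefTo String

-- Hypothetical model of an object file: a table of symbol definitions.
type Object = [(String, Def)]

-- LibA.o is the only object whose content depends on the message text.
libAo :: String -> Object
libAo msg = [("LibA.message", Literal msg)]

-- Assumption: at -O0, Main.o only holds a reference to LibA's symbol,
-- so it is identical before and after the message changes.
mainO :: Object
mainO = [("Main.main", RefTo "LibA.message")]

-- "Linking": resolve a symbol by following references through all objects.
link :: [Object] -> String -> String
link objs = go
  where
    table = concat objs
    go sym = case lookup sym table of
      Just (Literal s)  -> s
      Just (RefTo next) -> go next
      Nothing           -> error ("undefined symbol: " ++ sym)

main :: IO ()
main = do
  -- Executable linked before the change.
  putStrLn (link [libAo "Message #1", mainO] "Main.main")
  -- Relinking the unchanged Main.o against the rebuilt LibA.o.
  putStrLn (link [libAo "Message #2", mainO] "Main.main")
```

In this model the new message only appears if the link step runs again, which is the step that is being skipped.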


It sounds like the problem here might be that 9.4.5’s recompilation checker has been improved to avoid some unnecessary recompiles, but that could have exposed existing flaws in the relinking check, which were previously not noticed because of the recompilation checker being overly conservative. It looks like there are a few possible suspects:

Perhaps having an option to override the relink check would be a good idea:


I looked into this quickly and it seems that the problem is that we only check the modification times of .a files for the direct package dependencies rather than the transitive dependencies.

I opened a ticket to track the issue: GHC issue #23724, "Linking recompilation checking only checks modification time of direct package dependencies".
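
The difference between checking direct and transitive dependencies can be sketched in Haskell as follows. This is an illustrative model only, not the code in GHC: the names directDeps, transitiveDeps and newerThanExe are hypothetical, the modification times are made-up integers (larger is newer), and the dependency graph is assumed to be acyclic. The graph mirrors the example project, where the progB executable lists packageB (but not packageA) as a package dependency.

```haskell
module Main where

import Data.List (nub)
import Data.Maybe (fromMaybe)

type Package = String
type Time = Integer -- hypothetical modification time; larger is newer

-- Direct dependencies (base omitted for brevity).
directDeps :: [(Package, [Package])]
directDeps =
  [ ("progB",    ["packageB"])
  , ("packageB", ["packageA"])
  , ("packageA", [])
  ]

depsOf :: Package -> [Package]
depsOf p = fromMaybe [] (lookup p directDeps)

-- Every package reachable from p (assumes the graph has no cycles).
transitiveDeps :: Package -> [Package]
transitiveDeps p = nub (concatMap (\d -> d : transitiveDeps d) (depsOf p))

-- Made-up times: packageA's archive was rebuilt after the executable was
-- last linked; packageB's archive was not.
archiveTimes :: [(Package, Time)]
archiveTimes = [("packageA", 30), ("packageB", 10)]

exeTime :: Time
exeTime = 20

newerThanExe :: Package -> Bool
newerThanExe p = maybe False (> exeTime) (lookup p archiveTimes)

-- Check only the direct package dependencies of the executable.
relinkDirectOnly :: Bool
relinkDirectOnly = any newerThanExe (depsOf "progB")

-- Check every package the executable depends on, directly or not.
relinkTransitive :: Bool
relinkTransitive = any newerThanExe (transitiveDeps "progB")

main :: IO ()
main = do
  putStrLn ("Direct-only check says relink: " ++ show relinkDirectOnly)
  putStrLn ("Transitive check says relink:  " ++ show relinkTransitive)
```

With these made-up times, the direct-only check reports False and the transitive check reports True, so only the transitive check would notice the rebuilt packageA.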


On GHC skipping linking, the following GHC output may be of interest:

❯ stack build --ghc-options "-O0 -v -ddump-hi-diffs" --exec progB
...
packageB> compile: input file app\Main.hs
packageB> *** Checking old interface for Main (use -ddump-hi-diffs for more details):
packageB> Considering whether compilation is required for Main:
packageB> Module flags unchanged
packageB> Optimisation flags unchanged
packageB> HPC flags unchanged
packageB> signatures to merge in unchanged
packageB> []
packageB> implementing module unchanged
packageB> Checking interface for module Prelude base
packageB> Module fingerprint unchanged
packageB> Checking interface for module GHC.Types ghc-prim
packageB> Module fingerprint unchanged
packageB> Checking interface for module LibA packageA-0.1.0.0-DHArZEtuWXUA5Cw1WZLOMV
packageB> Module fingerprint unchanged
packageB> Checking interface for module LibB packageB-0.1.0.0-BDlnYQTGe6fLVPLNizclSi
packageB> [1 of 2] Skipping Main
packageB> Module fingerprint unchanged
packageB> *** Deleting temp files:
packageB> Deleting:
packageB> link: linkables are ...
packageB> [2 of 2] Skipping .stack-work\dist\c3556505\build\progB\progB.exe
packageB> LinkableM (2023-07-25 11:00:26.8815712 UTC) Main
packageB>    [DotO .stack-work\dist\c3556505\build\progB\progB-tmp\Main.o]
packageB> .stack-work\dist\c3556505\build\progB\progB.exe is up to date, linking not required.

On the changes made in GHC 9.4.5, GHC 9.2.8 does behave differently (and as you would expect, in terms of output). Specifically, with GHC 9.2.8 the final linking is not skipped:

resolver: lts-20.26 # GHC 9.2.8
packages:
- packageA
- packageB

❯ stack build --ghc-options "-O0 -v -ddump-hi-diffs" --exec progB
...
packageA> compile: input file src\LibA.hs
packageA> *** Checking old interface for LibA (use -ddump-hi-diffs for more details):
packageA>     Source file changed or recompilation check turned off
packageA> [1 of 1] Compiling LibA
...
packageB> compile: input file src\LibB.hs
packageB> *** Checking old interface for LibB (use -ddump-hi-diffs for more details):
packageB> Considering whether compilation is required for LibB:
...
packageB> Checking innterface for module LibA
packageB> Module fingerprint unchanged
packageB> [1 of 1] Skipping  LibB
...
packageB> compile: input file app\Main.hs
packageB> *** Checking old interface for Main (use -ddump-hi-diffs for more details):
packageB> Considering whether compilation is required for Main:
...
packageB> Checking innterface for module LibA
packageB> Module fingerprint unchanged
packageB> Checking innterface for module LibB
packageB> Module fingerprint unchanged
packageB> [1 of 1] Skipping  Main
...
packageB> compile: input file app\Main.hs
packageB> *** Checking old interface for Main (use -ddump-hi-diffs for more details):
packageB> Considering whether compilation is required for Main:
...
packageB> Checking innterface for module LibA
packageB> Module fingerprint unchanged
packageB> Checking innterface for module LibB
packageB> [1 of 1] Skipping  Main
packageB> Module fingerprint unchanged
packageB> Upsweep completely successful.
packageB> *** Deleting temp files:
packageB> Deleting:
packageB> Linking .stack-work\dist\6f82b39a\build\progB\progB.exe ...
packageB> link: linkables are ...
packageB> LinkableM (2023-07-25 11:13:27.5580171 UTC) Main
packageB>    [DotO .stack-work\dist\6f82b39a\build\progB\progB-tmp\Main.o]

Ah great, so it was a likely GHC bug, but one that is fixed now? That’s a good outcome.


I believe we’re running into this on GHC 9.4.5. Changes in transitive packages not explicitly depended upon don’t trigger re-linking. I can see the modules get rebuilt, but they don’t get relinked, which seems to be consistent with the analysis here.

Good work tracking this all down!


Well, it’s not fixed yet, but at least it’s been identified.


Is this why 9.2.8 is still the recommended GHC in GHCup? Or is there another major issue that I am unaware of?

@mpickering,

Did your fix go into the latest GHC release? When do you expect this to get released?

Thanks for your hard work here!


It looks like it should be in 9.6.3 and will also be in 9.8.1 when it is released.


The recommended GHC in GHCup is decided by the GHCup maintainers. All the releases are produced to the same standards.
