Application Memory limit 1TB

A research application that my lab has created seems to hit a limit at 1TB of memory use (out of memory). The machines I am running on all have more than 1TB of physical memory. I have checked ulimit, and both the virtual memory and max memory size limits are “unlimited”. I am wondering whether the “out of memory” is triggered when the physical memory required reaches the virtual memory allocated (1TB). I have looked for, but not found, an option to increase the amount of virtual/physical memory available to an application (as was increased in ghc-8.0). The application is used for research on large genomic data sets.

Any suggestions?
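
For anyone who wants to confirm the ulimit check from inside a process rather than from the shell, here is a minimal sketch using the POSIX `getrlimit` call. The helper names `format_limit` and `soft_limit` are made up for illustration. Note that `RLIMIT_AS` is reported in bytes, whereas `ulimit -v` prints KiB.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: writes "unlimited" or the limit in bytes into buf. */
void format_limit(rlim_t value, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    if (value == RLIM_INFINITY) {
        snprintf(buf, len, "unlimited");
    } else {
        snprintf(buf, len, "%llu bytes", (unsigned long long)value);
    }
}

/* Hypothetical helper: formats the soft limit of one resource
 * (for example RLIMIT_AS or RLIMIT_DATA) for the calling process.
 * Returns 0 on success and -1 if getrlimit() fails. */
int soft_limit(int resource, char *buf, size_t len)
{
    struct rlimit lim;
    if (getrlimit(resource, &lim) != 0) {
        return -1;
    }
    format_limit(lim.rlim_cur, buf, len);
    return 0;
}
```

Calling `soft_limit(RLIMIT_AS, buf, sizeof buf)` and printing `buf` should show "unlimited" on a machine configured as described above. If it does, the 1TB ceiling is not coming from the resource limits.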

Interesting! I also used to work on >1TB machines and wondered how ghc would handle it.

The RTS options section of the GHC User's Guide (5.7. Runtime system (RTS) options — Glasgow Haskell Compiler 9.8.1 User's Guide) describes one option for controlling max memory usage, but it defaults to “unlimited”. Just linking it for context.

Can you post the output of your program when the error occurs?

Thanks. The program is “phyg”, and I get:

phyg: out of memory

Hm, that’s not a lot to go on.

You mentioned ulimit, so I assume Linux?

How are you measuring/monitoring memory usage?

Yes, Linux.

I am using “top” to watch memory consumption, so I can see it rise to 1TB and then fail.
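
As a complement to watching “top”, the same numbers can be read from `/proc/<pid>/status` on Linux. The sketch below is only an illustration, and the function names are invented for it. It pulls one field such as `VmPeak` (peak virtual size) or `VmRSS` (resident set size) out of that file, in KiB.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Finds a line such as "VmPeak:   1234 kB" in the text of a
 * /proc/<pid>/status file and returns the number in KiB, or -1 if no
 * line starts with "<key>:" or the value cannot be parsed. */
long long status_field_kib(const char *status_text, const char *key)
{
    assert(status_text != NULL && key != NULL);
    size_t key_len = strlen(key);
    const char *line = status_text;
    while (line != NULL && *line != '\0') {
        if (strncmp(line, key, key_len) == 0 && line[key_len] == ':') {
            long long kib;
            /* %lld skips the tabs and spaces that follow the colon. */
            if (sscanf(line + key_len + 1, "%lld", &kib) == 1) {
                return kib;
            }
            return -1;
        }
        const char *newline = strchr(line, '\n');
        line = newline ? newline + 1 : NULL;
    }
    return -1;
}

/* Reads /proc/<pid>/status (Linux only) and returns one field in KiB,
 * or -1 if the file cannot be opened or the field is missing. */
long long read_status_field_kib(long pid, const char *key)
{
    char path[64];
    char text[8192];
    snprintf(path, sizeof path, "/proc/%ld/status", pid);
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        return -1;
    }
    size_t n = fread(text, 1, sizeof text - 1, f);
    fclose(f);
    text[n] = '\0';
    return status_field_kib(text, key);
}
```

Polling `read_status_field_kib(pid, "VmRSS")` for the phyg process would show whether resident memory really approaches 1 TiB (1073741824 KiB) just before the failure.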

One scenario could be that the VM is set at 1TB on startup and, even though the other settings are unlimited, that VM setting then acts like a system limit.

Indeed I was wondering when someone would hit this limit. The problem is that GHC uses a static 1 TB address space reservation for its heap. Sadly, there is currently no way to affect the size of this reservation at runtime. However, I believe that a patch like the following would suffice in raising the limit (but sadly I have no means of testing this):

diff --git a/rts/sm/MBlock.c b/rts/sm/MBlock.c
index 6eb33753045..fe56d50de02 100644
--- a/rts/sm/MBlock.c
+++ b/rts/sm/MBlock.c
@@ -663,7 +663,7 @@ initMBlocks(void)
 #if defined(aarch64_HOST_ARCH)
         size = (W_)1 << 38; // 1/4 TByte
 #else
-        size = (W_)1 << 40; // 1 TByte
+        size = (W_)16 << 40; // 16 TByte
 #endif
         void *startAddress = NULL;
         if (RtsFlags.GcFlags.heapBase) {
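
To make the sizes in the patch concrete, here is a small standalone sketch. It is not the RTS code. `reservation_bytes` and `can_reserve` are hypothetical names, and the `mmap` flags are an assumption about how such an address space reservation could be made on Linux. The point is that a reservation only claims address space and commits no memory, so its fixed size caps the heap no matter what ulimit reports.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS and MAP_NORESERVE in glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Bytes produced by an expression of the form (W_)factor << shift,
 * using a 64-bit word as a stand-in for W_ on x86_64. */
uint64_t reservation_bytes(uint64_t factor, unsigned shift)
{
    assert(shift < 64);
    return factor << shift;
}

/* Tries to reserve `size` bytes of address space without committing
 * any memory, then releases the mapping again.
 * Returns 1 if the kernel granted the reservation and 0 otherwise. */
int can_reserve(size_t size)
{
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED) {
        return 0;
    }
    munmap(p, size);
    return 1;
}
```

Here `reservation_bytes(1, 40)` is 1 TiB (1099511627776 bytes), `reservation_bytes(16, 40)` is 16 TiB, and `reservation_bytes(1, 38)` is the 1/4 TiB used on aarch64. Calling `can_reserve((size_t)16 << 40)` on the target machine is a quick way to check whether the kernel would grant a reservation of the patched size at all.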

Wow, thanks. I can try it, though I haven’t built GHC myself before; I have only used GHCup.

Interesting! Why is the limit lower on aarch64?

So, I was able to modify and build ghc. ghc itself was setting up 16T of VM, as hoped.

When building my app (phyg; git@github.com:amnh/PhyG.git), there was a link error (below).

Did I not fully build or configure correctly?

My compile line used “--with-compiler ~/home/ghc/_build/stage1/bin/ghc”

[62 of 62] Linking /home/ward/home/PhyG/dist-newstyle/build/x86_64-linux/ghc-9.9.20240101/PhyG-0.1.4/x/phyg/opt/build/phyg/phyg
/home/ward/.cabal/store/ghc-9.9.20240101/atomic-primops-0.8.4-4134d138778778a42a2da4976bc340844b2df4d5ee2bc70e095715fd5dcd5958/lib/libHSatomic-primops-0.8.4-4134d138778778a42a2da4976bc340844b2df4d5ee2bc70e095715fd5dcd5958.a(Atomics.o):function atomiczmprimopszm0zi8zi4zm4134d138778778a42a2da4976bc340844b2df4d5ee2bc70e095715fd5dcd5958_DataziAtomics_storeLoadBarrier_info: error: undefined reference to ‘store_load_barrier’
/home/ward/.cabal/store/ghc-9.9.20240101/atomic-primops-0.8.4-4134d138778778a42a2da4976bc340844b2df4d5ee2bc70e095715fd5dcd5958/lib/libHSatomic-primops-0.8.4-4134d138778778a42a2da4976bc340844b2df4d5ee2bc70e095715fd5dcd5958.a(Atomics.o):function atomiczmprimopszm0zi8zi4zm4134d138778778a42a2da4976bc340844b2df4d5ee2bc70e095715fd5dcd5958_DataziAtomics_loadLoadBarrier_info: error: undefined reference to ‘load_load_barrier’
/home/ward/.cabal/store/ghc-9.9.20240101/atomic-primops-0.8.4-4134d138778778a42a2da4976bc340844b2df4d5ee2bc70e095715fd5dcd5958/lib/libHSatomic-primops-0.8.4-4134d138778778a42a2da4976bc340844b2df4d5ee2bc70e095715fd5dcd5958.a(Atomics.o):function atomiczmprimopszm0zi8zi4zm4134d138778778a42a2da4976bc340844b2df4d5ee2bc70e095715fd5dcd5958_DataziAtomics_writeBarrier_info: error: undefined reference to ‘write_barrier’
collect2: error: ld returned 1 exit status
ghc: `gcc' failed in phase `Linker'. (Exit code: 1)
Error: cabal: Failed to build exe:phyg from PhyG-0.1.4.

Thanks

@ward I think it would be more effective for the GHC devs to have a shell on your workstation than debugging over forum 🙃

@ward remember you can sandwich code blocks between ``` to have them rendered as fixed-width.

I guess you shouldn’t try to build ghc-9.9. Maybe find the tag of the ghc-9.8.1 release and build that?

Good idea. ghc-9.8.1 built fine, but I get a ghc panic when compiling my app (this does not happen with the pre-built ghc-9.8.1 or with ghc-9.9.20240101):

ghc: panic! (the ‘impossible’ happened)
GHC version 9.8.1:
ModOrigin: hidden module redefined
x: unusable module
y: unusable module
Call stack:
CallStack (from HasCallStack):
callStackDoc, called at compiler/GHC/Utils/Panic.hs:191:37 in ghc-9.8.1-inplace:GHC.Utils.Panic
pprPanic, called at compiler/GHC/Unit/State.hs:239:14 in ghc-9.8.1-inplace:GHC.Unit.State
CallStack (from HasCallStack):
panic, called at compiler/GHC/Utils/Error.hs:503:29 in ghc-9.8.1-inplace:GHC.Utils.Error

Please report this as a GHC bug: report a bug · Wiki · Glasgow Haskell Compiler / GHC · GitLab

The error I get during compilation of ghc-9.9.20241010 is in atomic.c:

libraries/ghc-prim/cbits/atomic.c:175:10: error:
     note: ‘__sync_fetch_and_nand’ changed semantics in GCC 4.4
      175 |   return __sync_fetch_and_nand((volatile StgWord8 *) x, (StgWord8) val);
          |                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~