You can create a mutable Int using MutableByteArray#.
```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts
import GHC.Types (IO (..))

-- The field is unlifted, so it is already strict and unboxed;
-- no bang or UNPACK pragma is needed.
data Ref = Ref# (MutableByteArray# RealWorld)

new :: Int -> IO Ref
new (I# i) = IO (\s -> case newByteArray# 8# s of  -- assumes an 8-byte Int
  (# s1, a #) -> case writeIntArray# a 0# i s1 of
    s2 -> (# s2, Ref# a #))

read :: Ref -> IO Int
read (Ref# a) = IO (\s -> case readIntArray# a 0# s of
  (# s1, i #) -> (# s1, I# i #))

write :: Ref -> Int -> IO ()
write (Ref# a) (I# i) = IO (\s -> case writeIntArray# a 0# i s of
  s1 -> (# s1, () #))
```
But I think this representation wastes a few bytes. Assuming MutableByteArray# works like ByteArray#, the memory layout is: a one-word header, a one-word size field, and then the payload.
But the size is always the same, so there is no need for a size field. Looking at the GHC wiki, it seems like the size of a heap object, along with other information useful to the garbage collector, is stored elsewhere, in an info table pointed to by the header. I assume the info table will tell the GC to look at the size field. I'd like the info table to just say "the size of this object is X".
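If that layout guess is right, the per-object size field should be observable from Haskell: getSizeofMutableByteArray# reads it back from the heap object. A small check (the expected 8 assumes a 64-bit platform):

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts
import GHC.Types (IO (..))

-- Allocate an 8-byte MutableByteArray# and read back the size the
-- RTS recorded for it.
storedSize :: IO Int
storedSize = IO (\s -> case newByteArray# 8# s of
  (# s1, a #) -> case getSizeofMutableByteArray# a s1 of
    (# s2, n #) -> (# s2, I# n #))

main :: IO ()
main = print =<< storedSize  -- prints 8
```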
I looked at some unboxed mutable reference implementations (e.g. unboxed-ref and primdata) and they are using MutableByteArray# so, if my reasoning is right, they also waste a few bytes of memory.
I have three questions:

1. How do I implement a mutable Int that doesn't waste any memory?
2. Is my reasoning about the memory layout correct? More generally, how do I answer questions about runtime memory layout like this?
3. What is the canonical way of finding the size of Int in bytes? In the example above I just assumed it's 8. I found the answer: enable the C preprocessor, include MachDeps.h, and use SIZEOF_HSINT.
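To illustrate the SIZEOF_HSINT approach (a minimal sketch; the macro comes from MachDeps.h and expands to a plain integer literal):

```haskell
{-# LANGUAGE CPP #-}
#include "MachDeps.h"

-- SIZEOF_HSINT is a CPP macro from MachDeps.h: the size of Int in
-- bytes (8 on 64-bit platforms, 4 on 32-bit ones).
intSize :: Int
intSize = SIZEOF_HSINT

main :: IO ()
main = print intSize
```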
I understand the sentiment, but is this really important? If you need bulk mutable ints, then the size field overhead dwindles behind the mass of data itself.
It's completely undocumented but still possible: you can DIY your own heap objects in Cmm. You also need to write a few other Cmm functions to allocate your heap object in the current nursery and to read/write its payload, then use foreign import prim to import those functions into Haskell.
You may want to grep for the INFO_TABLE and ALLOC_PRIM macros in the rts Cmm files for some existing examples.
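For a flavor of what that involves, here is a rough, hypothetical sketch (all names invented; closure type, entry code, and GC interaction need real care, so treat it as a starting point only, not as working code):

```
/* Hypothetical: a 2-word heap object (header + one Int payload)
   with no length field. */
INFO_TABLE(stg_myIntRef, 0, 1, MUT_PRIM, "MyIntRef", "MyIntRef")
{ foreign "C" barf("MyIntRef entered!") never returns; }

stg_newMyIntRefzh (W_ init)
{
    W_ ref;
    ALLOC_PRIM (WDS(2));            /* bump-allocate in the nursery */
    ref = Hp - WDS(1);
    SET_HDR(ref, stg_myIntRef_info, CCCS);
    W_[ref + WDS(1)] = init;        /* the single Int payload */
    return (ref);
}
```

The Haskell side would then bind it with something like `foreign import prim "stg_newMyIntRefzh" newMyIntRef# :: Int# -> State# RealWorld -> (# State# RealWorld, Any #)` (again, hypothetical names).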
You're right, I omitted the "if you know what you're doing" suffix. I meant "safe" in the sense that the written Cmm code indeed does what it's supposed to do, but perhaps that notion of safety isn't very meaningful in the first place.
So the "DIY heap object" approach takes slightly less memory and is (therefore?) slightly faster. These benchmark results should apply, so it is around 4% faster on average.
Or maybe there is a difference between the atomic and non-atomic operations?
Wow, you beat fetchAddIntArray#? Does that mean fetchAddIntArray# can be removed from GHC, or is your approach applicable only to some use cases? (Pardon the ignorant questions, I'm just a hapless (and grateful) user.)
fetchAddIntArray#, as its name implies, works over arrays, while this custom C-- implementation works only on a single value. So it is not a full replacement.
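Concretely, the array orientation shows up in the type: fetchAddIntArray# takes an element index alongside the increment and returns the value held before the addition, so a single-cell reference is just the index-0 special case. A sketch against the MutableByteArray# representation from earlier in the thread (assuming an 8-byte Int):

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts
import GHC.Types (IO (..))

data Ref = Ref (MutableByteArray# RealWorld)

newRef :: Int -> IO Ref
newRef (I# i) = IO (\s -> case newByteArray# 8# s of
  (# s1, a #) -> case writeIntArray# a 0# i s1 of
    s2 -> (# s2, Ref a #))

-- Atomically add n to the single cell, returning the value it held
-- before the addition (the 0# is the element index).
fetchAdd :: Ref -> Int -> IO Int
fetchAdd (Ref a) (I# n) = IO (\s ->
  case fetchAddIntArray# a 0# n s of
    (# s1, old #) -> (# s1, I# old #))

main :: IO ()
main = do
  r <- newRef 5
  print =<< fetchAdd r 2  -- prints 5; the cell now holds 7
```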
Also note that I haven't done any work on this; @TerrorJack and @Bodigrim did.