I’ve long been of the mind that memory
needs to be improved. Due to concern about the library itself, it may be an opportune time to examine this library, and consider what we can do to improve it. Haskell has done a great job of pollenating other languages with concepts, and I have spent some time recently doing a lot of C, with some dabbling in Rust and Zig, all with a focus on memory management, allocators, alignment, and other dark arts that often left to the compiler in functional languages.
I think it is actively harmful to the community and to the language to consider such things as ‘beneath us’, since these things are usually considered an implementation detail that should never be visible. But think of what could be done with explicit control over allocators - what if we had a HasAllocator
like we have HasCallstack
? What if we could actually run a given function with a different allocator, and could guarantee immediate memory cleanup when it was done? Zig passes allocators explicitly, we have many options to hide them away as implicit arguments or behind monads. I think there is something to be learned, and brought back to Haskell - doubly so the rise of alternative compilers such as MicroHs, and the desire to compile efficiently for more architectures like embedded systems and WebAssembly.
To that end I have extracted the public API of memory
for assessment and it consists of some 170-ish functions, classes, data types, and constructors. Not a small surface, to be sure, but not an insurmountable foe. Of course, this does not include the implementation code, which is another beast altogether - so we should take the time to seek to understand why certain things were implemented so. Also, some time ago I did a reverse dependency search to see how memory
is actually being used (its here in some thread somewhere) and it turned out that a majority of use cases were limited to the same code - the ByteArray/Access
class, the typely-sized variant ByteArrayN/SizedByteArray
, and base encoding functions.
Anyway, lets get started.
The first thing worth considering is that the main class hierarchy - the ByteArrayAccess
and ByteArray
classes - are badly named, and do not properly separate their concerns anway. ByteArrayAccess
is an unwieldy name and is implicitly mutable, and ByteArray
is actually concerned with allocations. One could argue that ByteArrayAccess
is about accessing the length and pointer of an array of bytes, but then we should just call it ByteArray
and change the other class to Allocator
. But ByteArray/Access
doesn’t allocate / give you access to a byte array, it gives you an arbitrarily typed pointer of least length
bytes in size - but a pointer and an array are not the same thing, nor is the implicit presumption of a byte-addressable architecture necessarily a good one to make. It is also troublesome that one can allocate a ‘byte array’ using a Ptr Word64
but still have to describe its size in bytes. This just screams potential for unchecked exceptions and alignment problems, it again mixes or refuses to differentiate pointers vs arrays, and ignores the size information that the pointer’s type provides.
I could go on; about the only thing going for it is that these classes are then used to implement a generic ByteString
-like interface for all instances ByteArray
- yippee! That is actually very nice, (and explains the stupid naming convention), except we’ve voluntarily loaded a footgun by conflating pointers with arrays while simultaneously measuring in bytes instead of element count. In short, memory
made literally the exact same mistakes that C’s stdlib made (no surprise since that was deliberate mirroring), and it makes a pale comparison to Rust and Zig’s more modern memory and allocation tooling, let alone attempting to describe allocating a striped GPU vertex attribute buffer, or a non-traditional computing basis such as analog values and qubits. It is precisely these shortcomings that I wish to fix.
I have the violent opinion that one should only allocate memory in bytes when one is allocating an array of bytes (because it has that many bytes) OR as a void pointer (but only for a byte-addressable system). If we are allocating a pointer for a known type, we should use its size. I am also of the opinion that we should separate singular allocations from arrays and allocate using element count. The presumption of a byte-addressable architecture is also a problem - what of bit-addressable or word-addressible architectures? For a package that is supposedly about memory, it does a terrible job of it - for one, a memory package that doesn’t provide any allocators?
Enough ranting, can we do better? I think the first thing to do is re-work these core typeclasses - the rest of the library is actually mostly a grab-bag of low-level tools, so we can worry about that later.
Here is what I am throwing around as an alternative interface:
class (Monad m) => Allocator m alloc where
-- The memory type being allocated
type Memory alloc
-- The memory layout, such as element count and alignment, or even GPU vertex attribute layout
type Layout alloc
allocate :: alloc -> Layout alloc -> m (Memory alloc)
reallocate :: alloc -> Memory alloc -> Maybe (Layout alloc) -> Layout alloc -> m (Memory alloc)
deallocate :: alloc -> Memory alloc -> Maybe (Layout alloc) -> m ()
class (Allocator m alloc) => MonadAllocator m alloc where
getAllocator :: m alloc
NOTE:
memory
restricts its operations toIO
, probably to limit implementation complexity as well as the number of dependencies. It might be worth lifting this restriction. We can otherwise also consider an IOAllocator class that is fixed / specialized to IO.
This allows for shenanigans to hide the allocator, such as:
allocateM
:: forall m alloc . (MonadAllocator m alloc)
=> Layout alloc -> m (Memory alloc)
allocateM layout = do
alloc <- getAllocator @m @alloc
allocate alloc layout
And we can implement an allocator for base:Data.Array.Byte.ByteArray
data ByteArrayAllocator = ByteArrayAllocator
instance Allocator IO ByteArrayAllocator where
type Memory ByteArrayAllocator = ByteArray.MutableByteArray RealWorld
type Layout ByteArrayAllocator = ByteArrayLayout
{ layoutSize :: Size
, layoutAlign :: Align
}
-- NOTE: Null layout is 0,1 not 0,0
-- It contains the additive identity (size) and multiplicative identity (align)
-- This might be better described as the identity layout?
-- > A memory layout (for a pointer) consists of a size and an alignment, which
-- > presents the question: what is the identity / null layout?
-- > If we actually consider this, the natural answer is 0,1 where 0 is the
-- > additive identity (0 size) and 1 is the multiplicative identity (1 align
-- > aka any alignment) - in other words, the layout identity is the product of
-- > the additive and multiplicative identity
allocate _ layout = ByteArray.newAlignedPinnedByteArray (layoutSize layout) (layoutAlign layout)
reallocate _ ptr _ layout = ByteArray.resizeMutableByteArray ptr (layoutSize layout)
-- | Deallocating a ByteArray is a no-op since we rely on lazy gc >:(
deallocate _ _ _ = return ()
instance MonadAllocator IO ByteArrayAllocator where
getAllocator = return ByteArrayAllocator
And we could do the same for a C stdlib allocator, but I’ve run out of time. We’ll pick this up later - until then, thoughts?