Improving `memory` with better abstractions

I’ve long been of the mind that memory needs to be improved. Given the current concerns about the library’s maintenance, it may be an opportune time to examine it and consider what we can do to improve it. Haskell has done a great job of pollinating other languages with concepts, and I have spent some time recently doing a lot of C, with some dabbling in Rust and Zig, all with a focus on memory management, allocators, alignment, and other dark arts that are often left to the compiler in functional languages.

I think it is actively harmful to the community and to the language to consider such things as ‘beneath us’, since these things are usually considered an implementation detail that should never be visible. But think of what could be done with explicit control over allocators - what if we had a HasAllocator constraint like we have HasCallStack? What if we could actually run a given function with a different allocator, and could guarantee immediate memory cleanup when it was done? Zig passes allocators explicitly; we have many options to hide them away as implicit arguments or behind monads. I think there is something to be learned and brought back to Haskell - doubly so with the rise of alternative compilers such as MicroHs, and the desire to compile efficiently for more architectures like embedded systems and WebAssembly.
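To gesture at what I mean by a HasAllocator - this is purely a sketch of one possible spelling, mimicking how HasCallStack is just an implicit-parameter constraint; none of these names are settled API:

{-# LANGUAGE ConstraintKinds, ImplicitParams, RankNTypes #-}

-- One possible spelling of a HasAllocator constraint, by analogy with HasCallStack.
-- 'alloc' stands in for whatever allocator handle type we end up with.
type HasAllocator alloc = (?allocator :: alloc)

-- Run an action with a different allocator in scope.
withAllocator :: alloc -> (HasAllocator alloc => r) -> r
withAllocator alloc action = let ?allocator = alloc in action

-- Grab the ambient allocator, like HasCallStack's callStack.
currentAllocator :: HasAllocator alloc => alloc
currentAllocator = ?allocator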

To that end I have extracted the public API of memory for assessment, and it consists of some 170-ish functions, classes, data types, and constructors. Not a small surface, to be sure, but not an insurmountable foe. Of course, this does not include the implementation code, which is another beast altogether - so we should take the time to understand why certain things were implemented the way they were. Also, some time ago I did a reverse dependency search to see how memory is actually being used (it’s here in some thread somewhere), and it turned out that a majority of use cases were limited to the same code - the ByteArray/Access classes, the type-level-sized variant ByteArrayN/SizedByteArray, and the base-encoding functions.

Anyway, let’s get started.

The first thing worth considering is that the main class hierarchy - the ByteArrayAccess and ByteArray classes - is badly named, and the two classes do not properly separate their concerns anyway. ByteArrayAccess is an unwieldy name and is implicitly mutable, and ByteArray is actually concerned with allocation. One could argue that ByteArrayAccess is about accessing the length and pointer of an array of bytes, but then we should just call it ByteArray and rename the other class to Allocator. But ByteArray/Access doesn’t allocate / give you access to a byte array; it gives you an arbitrarily typed pointer to at least length bytes - and a pointer and an array are not the same thing, nor is the implicit presumption of a byte-addressable architecture necessarily a good one to make. It is also troublesome that one can allocate a ‘byte array’ using a Ptr Word64 but still have to describe its size in bytes. This just screams potential for unchecked exceptions and alignment problems, it again mixes or refuses to differentiate pointers vs arrays, and it ignores the size information that the pointer’s type provides.

I could go on; about the only thing going for it is that these classes are then used to implement a generic ByteString-like interface for all instances of ByteArray - yippee! That is actually very nice (and explains the stupid naming convention), except we’ve voluntarily loaded a footgun by conflating pointers with arrays while simultaneously measuring in bytes instead of element count. In short, memory made literally the exact same mistakes that C’s stdlib made (no surprise, since that mirroring was deliberate), and it pales in comparison to Rust’s and Zig’s more modern memory and allocation tooling - let alone being able to describe allocating a strided GPU vertex attribute buffer, or a non-traditional computing basis such as analog values and qubits. It is precisely these shortcomings that I wish to fix.

I have the violent opinion that one should only allocate memory in bytes when one is allocating an array of bytes (because it has that many bytes) OR as a void pointer (but only for a byte-addressable system). If we are allocating a pointer for a known type, we should use its size. I am also of the opinion that we should separate singular allocations from arrays, and allocate arrays using element count. The presumption of a byte-addressable architecture is also a problem - what of bit-addressable or word-addressable architectures? For a package that is supposedly about memory, it does a terrible job of covering it - for one thing, what kind of memory package doesn’t provide any allocators?

Enough ranting, can we do better? I think the first thing to do is re-work these core typeclasses - the rest of the library is actually mostly a grab-bag of low-level tools, so we can worry about that later.

Here is what I am throwing around as an alternative interface:

class (Monad m) => Allocator m alloc where

    -- The memory type being allocated
    type Memory alloc
    -- The memory layout, such as element count and alignment, or even GPU vertex attribute layout
    type Layout alloc

    allocate :: alloc -> Layout alloc -> m (Memory alloc)
    reallocate :: alloc -> Memory alloc -> Maybe (Layout alloc) -> Layout alloc -> m (Memory alloc)
    deallocate :: alloc -> Memory alloc -> Maybe (Layout alloc) -> m ()

class (Allocator m alloc) => MonadAllocator m alloc where
    getAllocator :: m alloc

NOTE: memory restricts its operations to IO, probably to limit implementation complexity as well as the number of dependencies. It might be worth lifting this restriction; alternatively, we could consider an IOAllocator class that is fixed / specialized to IO.

This allows for shenanigans to hide the allocator, such as:

allocateM
    :: forall m alloc . (MonadAllocator m alloc)
    => Layout alloc -> m (Memory alloc)
allocateM layout = do
    alloc <- getAllocator @m @alloc
    allocate alloc layout

And we can implement an allocator for base:Data.Array.Byte.ByteArray (with the allocation primitives coming from the primitive package, since base itself only provides the types):

data ByteArrayAllocator = ByteArrayAllocator

-- The layout of a byte array allocation: a size and an alignment, both in bytes.
data ByteArrayLayout = ByteArrayLayout
    { layoutSize :: Int
    , layoutAlign :: Int
    }
-- NOTE: The null layout is (0,1), not (0,0) - it combines the additive
-- identity (size) with the multiplicative identity (align), so it might be
-- better described as the identity layout:
-- > A memory layout (for a pointer) consists of a size and an alignment, which
-- > presents the question: what is the identity / null layout?
-- > If we actually consider this, the natural answer is (0,1), where 0 is the
-- > additive identity (0 size) and 1 is the multiplicative identity (1 align,
-- > aka any alignment) - in other words, the layout identity is the product of
-- > the additive and multiplicative identities

instance Allocator IO ByteArrayAllocator where

    type Memory ByteArrayAllocator = ByteArray.MutableByteArray RealWorld

    type Layout ByteArrayAllocator = ByteArrayLayout

    allocate _ layout = ByteArray.newAlignedPinnedByteArray (layoutSize layout) (layoutAlign layout)

    reallocate _ arr _ layout = ByteArray.resizeMutableByteArray arr (layoutSize layout)

    -- Deallocating a ByteArray is a no-op since we rely on lazy gc >:(
    deallocate _ _ _ = return ()

instance MonadAllocator IO ByteArrayAllocator where
    getAllocator = return ByteArrayAllocator

And we could do the same for a C stdlib allocator - I’ve run out of time to flesh it out properly, but a rough sketch of the shape I have in mind is below. We’ll pick this up later - until then, thoughts?
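(An untested sketch, just my guess at the shape: it reuses ByteArrayLayout for the layout and simply ignores alignment, since plain malloc offers no alignment control.)

import Foreign.Marshal.Alloc (mallocBytes, reallocBytes, free)
import Foreign.Ptr (Ptr)
import Data.Word (Word8)

data CAllocator = CAllocator

instance Allocator IO CAllocator where

    type Memory CAllocator = Ptr Word8
    type Layout CAllocator = ByteArrayLayout

    -- malloc(3) via Foreign.Marshal.Alloc; the alignment field is not honored here
    allocate _ layout = mallocBytes (layoutSize layout)

    reallocate _ ptr _ layout = reallocBytes ptr (layoutSize layout)

    deallocate _ ptr _ = free ptr

instance MonadAllocator IO CAllocator where
    getAllocator = return CAllocator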

20 Likes

I like the general idea, but this specific type class seems overly generic in a bad way. The allocate function now basically has the signature a -> b -> m c, which says almost nothing.

To define a good interface, I think we need some meaningful properties that these memory, layout, and allocator types follow. Or we could consider the denotation directly. For example, is Memory alloc equivalent to a function Layout alloc -> Byte (probably not but this is just to give some direction)?

If you reject that pointers and arrays are the same thing, then you should also reject that pointers can have a size.

Then surely the element type needs to be mentioned in the signatures of the class methods. For example, I would expect something more like: allocate :: alloc -> Layout alloc a -> m (Memory alloc a).

This allows for shenanigans to hide the allocator, such as:

But that requires using AllowAmbiguousTypes in (I think) a bad way; in the sense that you will need to use type applications whenever you want to use this function. Do we really need to pass the allocator explicitly? My intuition would be to hide it by default as part of the monad (e.g. as ReaderT alloc m).
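Roughly something like this (a rough sketch, reusing the Allocator class from the original post; the names are hypothetical):

import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Reader (ReaderT, ask, runReaderT)

-- Allocation inside the ReaderT just asks for the ambient allocator.
allocateR :: (Allocator m alloc) => Layout alloc -> ReaderT alloc m (Memory alloc)
allocateR layout = do
    alloc <- ask
    lift (allocate alloc layout)

-- Supply the allocator once, at the edge of the computation.
runWithAllocator :: alloc -> ReaderT alloc m a -> m a
runWithAllocator alloc action = runReaderT action alloc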

3 Likes

I have no particular opinion on what control over allocators should look like — but I would like to mention that control over memory allocation is a design choice, not a value judgment. It’s a trade-off: I can spend my time thinking about how memory is laid out, and perhaps gain some performance improvements out of that, or I can spend my time focusing on functionality and algorithm complexity at a modest performance penalty. In all but exceptional circumstances, I prefer the latter.

Also, there is a substantial body of work on memory management in lazy functional languages like Haskell, consider for example weak pointers or compact regions.

3 Likes

It’s not necessarily a trade-off. It’s possible you can improve memory’s API. You only have to trade off when you’re already at some sort of peak. It’s possible that there are ways to improve control of allocation without forcing it on the algorithm-writer.

Sometimes you can just do better rather than trade off!

2 Likes

I don’t have time yet for full replies, but they will be coming along with another post to elucidate things, so I will be brief.

I don’t disagree - consider this a very vague template for other classes. ‘memory’ is a very nebulous / generic concept until you start pinning those pieces down. For the most part, we can usually assume 8-bit byte-addressable architectures running in IO that return a standard Ptr wrapped in some data or newtype. I am hoping that we can add some constraints though - hence the desire for conversation.

This is a common belief, due to array decay in C, but pointers are not arrays, and I will stand on that point - though they are somewhat duals.

Pointers are addresses that support pointer arithmetic and can be used to implement arrays, but pointers are about contiguous memory / addresses and offsets, while arrays are indexed and are about constant-time access - very similar, but the devil is in the details. Pointers may be null; arrays must not be. Pointers are relative and deal with alignment, packing, word size, addressing mode, etc.; arrays are absolute and have simple indices.

I am not speaking of just the bog-standard standard C stdlib malloc, but also of things like kernel-level allocators that make such bog-standard simplicity possible :slight_smile:

I do prefer the monad approach myself, and I think that with more concrete properties, eg an allocator of a specific type will inform the other type variables, even though under the hood there is eventually the host allocator. Like, an object pool still deals with bytes, but it masks that with a better interface even though it must support the more low-level one. Keep the perspective of dealing with very primitive# things.

I will respond to more later.

3 Likes

I have a practical question. The package name memory is lost to the Haskell community while its current maintainer chooses not to develop the package further but also to not ‘hand the keys’ to any alternative maintainer. My question is, what is the best way forward:

A: fork memory under a different name and then ‘build on those foundations’; or

B: ‘start from scratch’ with a package that has a different name?

3 Likes

Addendum to pointers are not arrays:

An array and a pointer are the same, up to a point - up to the array element type being the pointer’s type when it is also the addressable type (usually Byte addressable, but sometimes Bit or Word). Eg a pointer to bytes in a byte-addressable system is equivalent to a byte array - but only precisely then.

However, this fails the moment you need to access multiple contiguous buckets - a pointer can implicitly address some number of contiguous buckets starting at an arbitrary position as a single unit, while an array must index into each slot individually to re-assemble the complete datum.

Consider the implementation of a struct, and pointers to its member fields of various types. A pointer just needs an offset to access a field member, but to simulate struct fields an array must pretend multiple slots of the same type make up data of a different type and then cast to it - but casting arrays is generally a Very Bad Idea.
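To make that concrete, a small (hypothetical) illustration using Foreign.Storable: reading the second field of a C struct { int32_t a; int32_t b; } needs only the base pointer and a byte offset.

import Data.Int (Int32)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekByteOff)

-- Read field 'b' given a pointer to the struct.
-- Assumes 4-byte int32 and no padding; the byte offset is the whole trick.
readFieldB :: Ptr () -> IO Int32
readFieldB p = peekByteOff p 4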

So pointers allow you to cast, but not all handle types (eg reference, array) allow casting.

It helps to remember that a pointer to an array is a pointer, not an array, and it is not the same thing as an array of pointers.

2 Likes

I would keep the original if I could, but I do not mind forking under a new name. ‘memalloc’ is my current thought, the other thought is to break it up into multiple smaller libraries with distinct intents - eg memory-allocators, memory-pointers, memory-encoding, etc

1 Like

If I implemented my memoria idea I would be (a) an ‘avoid bitrot’-type maintainer, (b) definitely in the market for more skilled co-maintainers, and (c) at the least, a person who could be contacted and could ‘reach out to the community’ if something arose that needed action beyond my own capabilities.

Given the feedback on that idea, I was also thinking that memoria could become less ‘monolithic’ as a package by spinning parts out into other packages on which it then depended.

4 Likes

I think a record is a better match for this than a typeclass

data Allocator m memory layout = Allocator
  { allocate :: layout -> m memory
  , reallocate :: memory -> Maybe layout -> layout -> m memory
  , deallocate :: memory -> Maybe layout -> m ()
  }

All the typeclass methods require you to pass around the alloc anyway, so there’s no real difference in terms of usability. The record’s field accessors become functions with similar types:

allocate :: Allocator m memory layout -> layout -> m memory
reallocate :: Allocator m memory layout -> memory -> Maybe layout -> layout -> m memory
deallocate :: Allocator m memory layout -> memory -> Maybe layout -> m ()
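For example, the earlier ByteArrayAllocator instance could just as well be a plain value (a sketch reusing ByteArrayLayout and the qualified ByteArray module from the original post):

byteArrayAllocator :: Allocator IO (ByteArray.MutableByteArray RealWorld) ByteArrayLayout
byteArrayAllocator = Allocator
    { allocate = \layout -> ByteArray.newAlignedPinnedByteArray (layoutSize layout) (layoutAlign layout)
    , reallocate = \arr _ layout -> ByteArray.resizeMutableByteArray arr (layoutSize layout)
    , deallocate = \_ _ -> return ()
    }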
5 Likes

That’s a fair point — I mean the trade-off between a) “need to control memory allocation” and b) “I don’t particularly care about memory allocation”. Once the trade-off has been made in favor of a), improving the API of the memory package ranks high in desirability.

3 Likes

Uh, isn’t a record much more dangerous than an instance because there can be multiple record implementations with the same type, but only one instance? You probably don’t want to be able to allocate memory with one allocator and deallocate it with a different one.

Sure, but that’s a problem with the current typeclass implementation as well.

For example, the post defines instance Allocator IO ByteArrayAllocator, but there’s nothing to prevent someone also defining

data ByteArrayAllocator2 = ByteArrayAllocator2

instance Allocator IO ByteArrayAllocator2 where

    type Memory ByteArrayAllocator2 = ByteArray.MutableByteArray RealWorld

Then one could mix and match calls to allocate with ByteArrayAllocator and deallocate with ByteArrayAllocator2.

Good point, Memory should probably be associated data instead of type.

1 Like

You can probably do something to prevent this at the type level. Add a type variable and ST-trick it? Maybe?
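Sketching what I mean (hypothetical names, building on the class from the original post): tag the allocator and the memory it hands out with a phantom region variable, and only expose a rank-2 runner, so memory from one region cannot be freed through another region’s allocator.

{-# LANGUAGE RankNTypes #-}

newtype TaggedAlloc s alloc = TaggedAlloc alloc
newtype TaggedMem s mem = TaggedMem mem

-- The phantom 's' is only ever introduced here, so it cannot leak out.
withRegion :: alloc -> (forall s. TaggedAlloc s alloc -> IO a) -> IO a
withRegion alloc action = action (TaggedAlloc alloc)

-- Allocation and deallocation demand matching region tags.
allocateIn :: Allocator IO alloc => TaggedAlloc s alloc -> Layout alloc -> IO (TaggedMem s (Memory alloc))
allocateIn (TaggedAlloc a) layout = TaggedMem <$> allocate a layout

deallocateIn :: Allocator IO alloc => TaggedAlloc s alloc -> TaggedMem s (Memory alloc) -> Maybe (Layout alloc) -> IO ()
deallocateIn (TaggedAlloc a) (TaggedMem m) layout = deallocate a m layout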

I’d like to understand the “Memory”, “Layout”, and “Allocator” abstractions a bit better. As pointed out above this seems to allow the idea that you would ever want to use multiple types of allocators or layouts on the same memory (which I don’t think is true?).

Isn’t what you currently have essentially equivalent to:

class (Monad m) => Allocatable m p a | a -> p where
    allocate :: a -> m p
    free :: p -> m ()

because I don’t see how the Layout and Memory associated types are actually used by the abstraction. Why is the Layout argument to deallocate wrapped in Maybe?


Internet issues and I lost the second part of this post, removing the partial and re-posting below.

I agree with @jaror vis-a-vis starting with the denotation. I did a little thinking on this and have the following thoughts:

In Haskell we already have well-explored semantics for memory in a general sense (i.e. the storage and retrieval of values) in the State monad. State obviously is only concerned with a single value, whereas memory is this sort of inherently addressable structure. But, we could take the State interface and kind of twist it into a less general shape:

(I’m going to elide the specifics of typeclasses vs records etc here and just speak to the semantics)

-- p is the address/pointer type
-- v is the value type
Memory = 
    read :: p -> Maybe v
    write :: p -> v -> m ()
    size :: Int -- In terms of v

So let’s say you have some block of memory represented by a tuple of (read, write, size) functions. Could we represent allocation as a localized coordinate transformation, i.e.:

allocate :: Memory p v -> Int -> (Memory p v, Memory p v)

where the first element of the result is the memory, offset and bounded by the given Int, and the second is the “rest” of the memory. Different allocation functions could take the argument read/write functions and map them (or transform the coordinate space) in as complex or simple a way as they like.
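As a toy rendering of that idea (purely illustrative - I’ve fixed the address type to Int and moved read into m, since mutable reads are effectful):

-- A toy block of memory: Int addresses, reads and writes in some monad m.
-- (Bounds checks elided; this is just to make the splitting concrete.)
data ToyMemory m v = ToyMemory
    { readAt  :: Int -> m (Maybe v)
    , writeAt :: Int -> v -> m ()
    , size    :: Int
    }

-- "Allocation" as a coordinate transformation: carve off the first n slots,
-- and shift the remainder so its addresses start at 0 again.
split :: ToyMemory m v -> Int -> (ToyMemory m v, ToyMemory m v)
split mem n =
    ( mem { size = n }
    , ToyMemory
        { readAt  = \i -> readAt mem (i + n)
        , writeAt = \i v -> writeAt mem (i + n) v
        , size    = size mem - n
        }
    )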

Deallocating is then like recomposing these two structures or un-applying the coordinate transform. (Maybe Memory stores its own reversal transformation? I suppose you need contextual information as well for many allocation strategies.)

Then you can imagine something like:

scale :: (Memory p a -> p -> b) -> Memory p b

Meaning if you can read some b out of a memory of a (e.g. a struct type out of a memory of bytes) then you can get memory in terms of b.

So you could allocate an array of a struct from a memory of bytes by composing scale and allocate.

I’m just spitballing and what I’ve outlined here might be very far from correct, but I’d like to see a coherent algebraic abstraction of memory rather than imitating Zig’s solution. I don’t believe passing around some parameterized struct that you can use to create IO is going to play to the strengths of Haskell.

1 Like

I still need to respond to many things here - oh, I do have responses, but I also need to post what I have been working on, as it is a prelude to discussing those very things in some depth. It is rather wordy, but that is somewhat unavoidable. So, without further ado, here is what I have been pondering and what I have written in preparation for the development of a relevant library (note that this is more or less meant to be the start of the README of said library):


Memory: A primer

Abstract: Why Haskell needs improved memory abstractions

Most people have a simplistic view of memory, even among programmers. These days it is easy to encounter someone who has never set foot outside of a memory-managed language. After all, why should they?

That line of thinking is to the detriment of the ecosystem.

There is an over-reliance on the operating system to provide seemingly infinite memory through the combination of a virtual address space and swapping pages to disk, which makes developers less than mindful - dare I say even mindless - when it comes to memory considerations. This is, after all, how we’ve ended up with eg bloated video games that require 64 GB of RAM, 12 GB of VRAM, and 300 GB of disk space.

Such a foundational topic must not be ignored; even Haskell has fallen prey to this mode of thinking before, resulting in now-historical issues regarding unpinned ByteStrings. Indeed, owing to what amounts to an utter lack of low-level memory abstractions, the Haskell RTS is written in some 50 kloc of C, which is an embarrassment for what is supposedly one of the most powerful languages on the planet.

NOTE: Although we have and will use the phrase ‘memory abstractions’, that phrase is not entirely accurate. We are in some sense dealing with the mirror-opposite of abstraction, describing what is way-below instead of what could be way-above. Perhaps concretion*, or co-abstraction (contra? counter?) would be a better term - ‘what is implemented’ and ‘how it is implemented’ both become open-ended (eg, abstract-ish) just in opposite directions / as different consequences of the same reasons. If this sounds like Abstract Nonsense, it’s because it is.

However, since it is a mirror concept, the syntax of typeclasses and instances suffices to discuss and ahem implement. The distinction is really only important for relating memory management to something called displacement, which in turn helps explain the need for a finitist and constructive approach to memory ‘abstractions’, and how this is necessarily a mirror / reflection / corollary of the pure functional approach being infinitist.

If this is confusing right now, do not worry, it is not the focus, just an aside. The only really important bit is the distinction between the finitist vs infinitist approach which will come up shortly, because computers have finite memory - or do they? Argh!

* I hesitate to call it ‘implementation’ (as a reflection of ‘abstraction’) since that term is already too loaded.

So what would improved memory abstractions get us?

So far, Haskell has relied on not needing to handle memory management, instead preferring to unsafely use ByteStrings for buffers and rely on the GC (garbage collector) to clean them up - this may suffice for common use cases, but is utterly untenable for many other use cases eg embedded or kernel development.

Maybe you remember what Haskell was like before the real-time GC, the stutter of latency caused by the stop-the-world GC that made Haskell an instant no-go for developing any application needing predictable real-time behavior - who wants to work in a language that can’t even run at 30fps?

Maybe it wouldn’t have taken until literally the 2020’s with GHC 8.10 to build and release the RTGC. Would it have taken so long if Haskell had the proper tools for describing efficient memory layouts and allocators and lifetimes, which languages like Zig and Rust now enjoy even as we do not?

As Heinrich said so eloquently that I must repeat it here, once the choice has been made in favor of “I need to control memory allocation” over “I don’t care about memory allocation”, improving the API of the memory package ranks high in desirability.

Why? Because we could rewrite more of the RTS in Haskell, provide better support for non-GHC compilers, embedded systems, and kernel development, improve cryptographic tooling*, or even write a performance-competitive game engine*. But we can’t do that without first doing the work to provide the requisite abstractions.

* These are some of my own reasons for wanting this. I used to be a game developer, there’s a reason I’m familiar with low-level memory.

So, let us endeavor to fix that, and dive on in.

What is Memory?

In order to abstract memory, we must define it. What are some facts about memory?

  • Memory is a place to access and / or store data.

This is a good start, but is vague enough as to be useless. Perhaps it would be better if we had some examples first. Then we shall try again.

What are some examples of memory?

  • RAM (Random-Access Memory)
  • ROM (Read-Only Memory)
  • WR+ (Write-Once, Read Many)
  • CPU registers and cache
  • A remote database
  • An (analog) audio tape
  • A quantum memory bank
  • A GPU
  • A Turing tape
  • A book
  • A brain
  • A man in a room, with a printer, filing system, and OCR scanner

Obviously, some of these are a bit far-fetched, and yet they are undoubtedly all memory. A brain is certainly memory - the original, still made of meat - but it would be silly to try to characterize how the brain stores information right now. Not because it is impractical and useless, but because we have more basic abstractions to deal with first, before such things can become practical and useful.

Nevertheless, each of these illustrates some point or quirk of memory, and none of these are privileged; each is equally valid, because completeness is funny like that.

  • RAM is your standard concept of classical binary memory, which is the most common because it is so useful
  • The other types of read-write memory tell us that it isn’t just data structures that may be mutable or immutable - sometimes it is a restriction of the memory itself
  • The CPU registers and cache tell us that memory addresses may be temporary or even unobservable
  • The remote database shows us that memory need not be local nor synchronous
  • The analog tape and quantum bank show us that the memory unit need not be binary / digital or even deterministic
  • The GPU shows us that memory layouts may be complex, not directly addressable, and that units may be of indeterminate precision
  • The Turing tape shows us that memory need not be finite, may have ‘units’ of indefinite complexity, and may require relative addressing
  • The book, the brain, and the man show us that memory need not even be ‘part of a computer’*, only that the data be made accessible to it*.

* Or perhaps this is what it means to ‘be part of’ a computer. The answer is a matter of displacement. I am excited to write about that soon, but this comes first.

All of these types of memory have vastly different qualities / properties and thus different uses - but all are certainly ‘memory’. However it seems best to limit our focus to abstractions that are the most useful - so we will ignore the book, the brain, and the man. With the rest of these examples in mind, let us re-state our definition of memory:

  • Memory is a place for us to store information, in a layout of one or more units of some sort, that may be addressed later to look it up.

That is much better. It tells us who (the user), what (storing data), where (in the address space), when (from now until later), why (to look up data), and how (according to some layout).

NOTE: The 5 W’s are excellent tools for dissecting a topic for discussion via displacement

So:

  • Memory is about storing data in an address space.
  • Memory stores that data at an address that points to a set of one or more units of memory
  • Memory arranges that data in those units according to some layout
  • Memory controls storage according to access rules

Now we are getting somewhere - we have some concepts that make meaningful distinctions between various types of memory - addresses, units, layouts, and access - which we must define first in order to be prepared for later concepts like pointers and arrays and allocators.
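Just to gesture at where this is heading - a purely illustrative sketch, not the planned API; each of these concepts will get proper treatment in the upcoming sections - one could imagine these four distinctions eventually surfacing as associated types of a single class:

-- Illustrative only: one instance per kind of memory, with its address,
-- unit, layout, and access model captured as associated types.
class Memory mem where
    type Address mem
    type Unit mem
    type MemoryLayout mem
    type Access mem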

NOTE: Addresses and layouts are duals; don’t worry about how or why just know that they are related

Note that information is always stored, but not necessarily accessed later. This is because memory is only (ever) meaningful if it is written to (at least once*), and it is nonsensical to try to access data that has not been written. A printer is an example of write-only memory. Well, only if you don’t count OCR / human re-input.

* Yes, even read-only memory needs to be written to at least once to be useful. Does that make the name stupid? Also, WR+ memory becomes ROM when written to - it’s more or less the same thing, just a pre- vs post-write state. Now, dynamic mutability, that’s an interesting property of memory to discuss, but definitely out of scope for what is supposed to be a primer.


Alas, this is all I have written up so far, coming soon in roughly this order:

  • Addresses and Address spaces
  • Memory units
  • Layouts
  • Access
  • References
  • Pointers
  • Arrays
  • Allocators

Until next time!

10 Likes

I think the whole discussion is a bit confused. What sort of allocator are we talking about? One that is used to allocate standard Haskell values? That allocation is integrated very deeply into the RTS. How is this even supposed to work?

On top of that, GC interacts poorly with prompt deallocation. If a value is explicitly deallocated at some point, one has the problem of ensuring that it’s no longer reachable. Rust built a whole language around linearity and the borrow checker to make this work. Without solving this problem one gets use-after-free bugs and loses memory safety. Haskell’s tools for dealing with this are limited: ST-like tricks, and there’s research into regions.

Or are we talking about allocation of memory not managed by the GC? How is it going to be integrated into the wider Haskell ecosystem, and how are safe wrappers created? Are ForeignPtr and bytestring-like wrappers used? Some other mechanism?

3 Likes

Are you asking questions about Haskell? Or about GHC? Because the Haskell 2010 Language Report does not specify a garbage collector, or even mention one outside of hypotheticals in reference to the FFI, the hs_perform_gc function, and the StablePtr type. As long as we confuse the latter for the former, it will continue to dominate Haskell discourse, to the detriment of both.

  1. you have stated the problem excellently - that we are currently incapable of something is no reason to remain incapable, that is learned helplessness
  2. This is why part of the upcoming discussion involves LinearTypes
  3. Precisely - and this is why I would like to begin the process of providing tools to help alleviate that.

It is not an option for Haskell to remain ignorant of safe memory management practices, as that leaves us incapable of performing any duty that requires them - which is rather a lot of interesting things to abandon to other languages, eg security and cryptography and networking - especially for a language one of whose main draws is type safety, and which thus ought to be uniquely well-suited for it, provided one can imagine an RTS other than GHC’s.

Rust had to build a garbage collector and a borrow checker; we’re in a far better position because we can do the same if we want to. GHC and Rust didn’t appear from nowhere; they both started in some small conversation like this.

  1. Yes because explicitly managed memory is an extremely powerful tool, and also we need it for interacting with and describing foreign memory (and not just C please) and also because we can describe the surface of the GHC RTS fairly simply as a garbage-collected memory-management system that provides no guarantees of prompt cleanup. A deeper dissection of its innards would of course take time.

  2. New interfaces to provide the same existing functionality as the memory package, but built on a deeper foundation that describes more than just byte-addressable arrays in a userland virtual address space that assumes OS endianness / word size, etc and also provides stock allocators including ones to represent the standard C allocator, the existing lazily-cleaned pinned ByteString allocator, and probably things like call stack, slab, and buddy allocators.

    I had the joy of playing with Elixir / Erlang again recently, and built-in bit-addressability is very nice. Our Data.Bits class sucks, because it conflates boolean logic, bit-addressable operations, and byte-addressable operations, and that makes it really annoying to work with, just like Num.

  3. Yes, I would like to make instances of the resulting classes for things like Ptr and ForeignPtr and StablePtr and ByteString and ByteArray# (<-here! this one is really important)

I want to describe what already exists, as it already exists, and then we can start improving it.


Finally, a rather large driving force behind this is specifically that the bindings to the botan cryptography library deal with a rather large amount of memory allocation that you don’t know the size of beforehand, and also needs prompt cleanup for security - presumably we can both agree that having sensitive data hanging around in GHC’s pinned bytestring allocation space indefinitely is a Very Bad Thing.

Beyond that, an improved ByteArray typeclass would still support the convert function, which is the #1 most popular use case of the memory package, if I recall accurately. This is useful for dealing with newtypes over bytestrings or foreign memory.

The existing ByteArray/Access classes are awkward though. I do intend to improve them, because they are already split along the very lines of ‘reading’ and ‘allocating’ bytestring content; it’s just that 1) they make the allocator implicit to the class (which is not necessarily a bad thing, I just want to explore explicit allocators too), and 2) they ignore mutable memory, which is one of the most interesting aspects of memory.

Remember, just because we can, doesn’t mean we (always) should. This applies to automatic garbage collection and implicit allocators.

I hope this exposes my internal thought process a bit more without being too… stark :grimacing: rather I am thankful for such pointed questions.

5 Likes