For clarity I won’t be using the phrase “immutable records” in this topic. Only “ADT products”, “field labels” and “record syntax”.
1. Status quo
Through the years there have been numerous attempts at implementing mutable records separately from GHC (see e.g. this Reddit thread from 2018). All of them have ultimately failed, as Haskell’s type system on its own is too verbose to make such a feature look good and too weak to support it properly.
The need for mutable records has however never left. The community chose to emulate them using ADT products, a task I argue they’re not fit for, leading to several curious outcomes:
-
Complaints about laziness as a norm.
The complaints are never about it being hard to use laziness, they’re about not being able to escape it. Everyone is signed up to the notion that they have to put their application state into an ADT product, at which point a single failure to fully evaluate the next edition of their entire state snowballs into a memory leak.
-
Partial ADT product updates as a feature.
ADTs cannot be updated in place, they’re constructed anew each time, right? Why do we then use
x { a = b }and notC { a = b, .. = x }? I have no idea how to reconcile nested updates inOverloadedRecordUpdatewith this, and this is currently the community-consensus solution to proper record syntax. -
Allocation cost for copies of ADTs as the cost of doing business.
Yes, I get it, it’s a bump-allocating garbage collector, allocations are dirt-cheap; no, I shouldn’t have to waste space and time duplicating references to things I’m not interacting with in the moment.
2. IORefs
The closest we can get to mutable records in current Haskell is an ADT product of IORefs. This allows accessing each element independently, with addresses each of the problems outlined above. It’s also obviously a terrible approach, as it requires allocating a number of IORefs linear to the number of values.
The documentation to MutVar#, which backs IORef, says
A
MutVar#behaves like a single-element mutable array.
A lot of people know about mutable arrays of values through vector, but in fact GHC supports these directly. There is however no nice way to access such arrays heterogeneously in today’s Haskell.
3. Mutable records
Consider the following module, taking both function names and behaviors from primitive#SmallArray (which in turn took from GHC.Exts):
module Data.Mutable where
-- Kind of mutable records
data MutableLayout
-- Instances for this type class are only created by the compiler.
class HasMutableField (x :: Symbol) (r :: MutableLayout) a | x r -> a
data Immutable (r :: MutableLayout)
index :: HasMutableField x r a => Immutable r -> a
data Mutable (r :: MutableLayout)
read :: HasMutableField x r a => Mutable r -> IO a
write :: HasMutableField x r a => Mutable r -> a -> IO ()
thaw :: Immutable r -> IO (Mutable r)
freeze :: Mutable r -> IO (Immutable r)
-- More variants, e.g. MutableST, can be added below.
We can then write a declaration like the following, mirroring a declaration of an ADT product with field labels:
{-# LANGUAGE MutableRecords #-}
mutable Con a b =
Con
{ a :: a
, b :: !(IntMap b)
, c :: {-# UNPACK #-} !Int
, d :: Mutable Foo
}
This declaration would:
-
Add type
Con :: Type -> Type -> MutableLayoutto the type-level; -
Add a constructor
Conthat serves as a record pattern synonym for
Immutable (Con a b); -
Have GHC infer the internal data representation for this mutable record;
-
Create a
HasMutableFieldinstance for each field, describing how each field is to be accessed.
Using the constructor and thaw is thus the only way to create a new mutable record, which can be read from and written to in IO, and converted back to immutable using freeze.
Caveats:
-
To support unpacking fields
[Im]mutablewill have to be backed by a special array that combines[Mutable]ByteArray#andSmall[Mutable]Array#. The data layouts for both are rather simple, this shouldn’t be a problem; -
Using record syntax to partially update an
Immutableshould not be allowed, as this can be done directly using mutation. This means thatx { a = b }can never refer to anImmutablepattern synonym, simplifying implementation; -
Record pattern synonyms are not a GHC feature at the moment, but I assume making GHC create special synonyms for this occasion is a separate, simpler, topic;
-
Nested record update would require a separate type class
(e.g.nestedWrite @'["foo", "bar", "baz"]), which could be simplified further with a magic type family that breaksSymbols on periods
(nestedWrite @"foo.bar.baz"); -
How GHC chooses to store the fields in a mutable record is an implementation detail that should not be exposed to the user. This feature is not here to solve FFI interoperability;
-
These records are not extensible, I’m not opening that can of worms.
And that’s it. Every part of this seems to already exist in GHC today, doesn’t seem like it would require any massive changes, it’s well-contained in a module and an extension.
4. GHC proposal
There’s only one issue here: the claims I’m making in the first section are quite outlandish. Virtually every real-world program out there is structured as described and the conclusion I’m ultimately drawing towards is that the syntax for x { a = b } should be deprecated (it’s in the Haskell98 report, section 3.15.3).
It is thus important for me to see what community’s views on this are and whether mutable records are viewed as a necessary, or at least interesting, feature. Should this gain enough traction I’ll expand it into a GHC proposal.