1. The problem
I think records are my only great annoyance with Haskell-as-a-language at this point. A decade of work has been poured into figuring out a good solution to extend the record syntax for ADTs specifically, which comes with its own host of problems because ADTs are immutable and have sums in them. At the same time the language does not bother providing any support for those same updates on mutable datatypes, which is pretty much irreplaceable in low-level programming.
2. A solution
Libraries like vinyl
present a different approach, storing all record data at the type level. The implementation can then be defined over a specific datatype as one sees fit.
2.1. Single-level
After prototyping a bit and throwing out all the extensibility stuff, it seems like that approach can be reduced to a very lean type-level core:
data Field a = Symbol := a
type Fields a = [Field a]
class Add (k :: Symbol) (v :: a) (xs :: [Field a]) (ys :: [Field a]) | k v xs -> ys
class Find (k :: Symbol) (xs :: [Field a]) (v :: a) | k xs -> v
-- The following two are only needed for implementing records over arrays
class Length (xs :: [Field a]) (n :: Nat) | xs -> n
class Index (k :: Symbol) (xs :: [Field a]) (n :: Nat) | k xs -> n
Which then makes it trivial to define, for example, an array record that can be used in both immutable and mutable forms (mimicking Data.Primitive.SmallArray
):
newtype MutableSmallArrayRec m (xs :: [Field Type])
newtype SmallArrayRec (xs :: [Field Type])
-- Creation functions are omitted, but there are type-safe ways to do so.
freeze :: Length xs len => MutableSmallArrayRec IO xs -> IO (SmallArrayRec xs)
thaw :: Length xs len => SmallArrayRec xs -> IO (MutableSmallArrayRec IO xs)
assign :: (Find k xs v, Index k xs n) => v -> MutableSmallArrayRec IO xs -> IO ()
access :: (Find k xs v, Index k xs n) => MutableSmallArrayRec IO xs -> IO v
index :: (Find k xs v, Index k xs n) => SmallArrayRec xs -> v
A JWT, which is a JSON object, could be expressed as a type-safe dictionary in a similar manner.
It’s even generic enough that it can be used to fix the annoying Storable
typeclass:
data Offset = Offset Nat Type
class Allocable struct where
type SizeOf struct :: Nat
type AlignOf struct :: Nat
type FieldsOf struct :: [Field Offset]
newtype PtrRec (struct :: Type) = PtrRec (Ptr ())
assign
:: ( Find k (FieldsOf struct) ('Offset offset v)
, KnownNat offset
, Storable v
)
=> v -> PtrRec struct -> IO ()
access
:: ( Find k (FieldsOf struct) ('Offset offset v)
, KnownNat offset
, Storable v
)
=> PtrRec struct -> IO v
2.2. Nesting
Dot-notation can be added on top completely separately:
class DotNotation (k :: Symbol) (ks :: [Symbol]) | k -> ks
Then custom type classes can be used to chain individual segments.
For the FFI level array support could be added as well: consider foo.bar[3].honk
where "bar" := CArray 'UnknownSize Bar
(or perhaps 'KnownSize 4
).
3. The issues
3.1 Type families
Recursive type families are atrociously slow, but functional dependencies can be used instead to do the exact same thing, so this doesn’t need to be solved for this to work.
3.2. Lists
The big problem is that fields are described as a type-level list. This means symbol lookup and equality checks are slow, plus duplicates have to be manually pruned on insertion.
Within Haskell’s type-level system as is this appears unsolvable (as in “too painful to implement”), but I think a temporary solution could be a part of base
. Consider a type-level dictionary, somewhere in GHC.TypeFields
or similar:
data Fields a -- UTF-8 ordered dictionary from `Symbol` to `a`.
type family Empty :: Fields a
-- Fails on duplicate insertion.
-- Should only allow a subset of characters for coherent nesting.
type family Add (k :: Symbol) (v :: a) (xs :: Fields a) :: Fields a
-- Fails when no such symbol exists
type family Find (k :: Symbol) (v :: a) (xs :: Fields a) :: a
type family Size (xs :: Fields a) :: Nat
-- Fails when no such symbol exists
type family Index (k :: Symbol) (xs :: Field a) :: Nat
3.3. Nesting
Notation slicing functions are way more complex than list traversals, so baked-in type families would be far more efficient than a complex network of symbol conses, unconses and appends.
4. Conclusion
This feels like it should’ve been a GHC proposal made in 2018 by someone far more knowledgeable than me. It does not need any new Haskell extensions, only a small handful of helper functions in base
, which some day would be subsumed by a more powerful general approach to type-level programming. It can be freely extended by anyone willing outside of base
and It incurs no runtime overhead.
I wish I could make a plugin to see if I’m wrong on this, but I don’t know if an entire type-level dictionary can be implemented that way.