I’m sure someone has had this idea before, but who knows:
Ever since I learned how data types are represented in Haskell (shameless plug: if you don’t know, maybe watch my latest video playing around with ghc-vis), I have wondered whether I wouldn’t rather want something more memory-compact than that in some cases.
And now I wonder how well it would work to have, say, a library that you feed a data type definition (via TH), and that produces a type definition, a set of pattern synonyms and a `COMPLETE` pragma, so that:
- In terms of programming, it’s indistinguishable from the plain data type.
- In terms of semantics, it’s indistinguishable as well (maybe requiring all constructor fields to be strict for that).
- But it’s backed by a very different implementation. For example, all values could be `ByteArray`s with a very compact encoding of the value, so that even a nested, complex value fits into a few bytes. Or, if the type isn’t recursive and the encoding therefore has bounded size, the implementation could be just one or a few `Word64#`s.
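To make the idea concrete, here is a minimal hand-written sketch of what such a library might generate (all names are invented for illustration): a type that programs like `data MB = N | J !Bool`, but is backed by a single `Int`, with pattern synonyms and a `COMPLETE` pragma making it look like the plain data type.

```haskell
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE ViewPatterns #-}

-- Hypothetical sketch: what the TH library might generate for
--   data MB = N | J !Bool
-- backed by a compact representation (here: a single Int).

newtype MB = MB Int
  deriving Eq

-- Encoding: 0 = N, 1 = J False, 2 = J True
decode :: MB -> Maybe Bool
decode (MB 0) = Nothing
decode (MB n) = Just (n == 2)

pattern N :: MB
pattern N = MB 0

pattern J :: Bool -> MB
pattern J b <- (decode -> Just b)
  where
    J b = MB (if b then 2 else 1)

-- Tell the exhaustiveness checker that these two patterns cover MB,
-- so users can pattern match as if it were the plain data type.
{-# COMPLETE N, J #-}
```

Construction and matching then look exactly like they would for the ordinary data type, while the value is just a boxed machine word.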
Depending on the precise implementation, I could imagine that pattern matching becomes about as fast as it is now (read a byte at a specific offset). Or maybe projections can be done by slicing the compact binary representation, and thus don’t allocate much.
WDYT? Has this been tried before? What encoding might be the best?
The idea comes partly from seeing GHC developers use short-hand descriptions like `1!P(L,L)` for values that would be rather large in memory, and I wonder if maybe GHC itself should use a similar encoding for more memory efficiency.
Oh, and actually, looking at code related to that I find a very similar idea: the `Card` type looks like an enumeration, and can be used like one thanks to pattern synonyms, but it’s actually just an `Int`. Although there the underlying representation is meaningful as a bitvector and some operations are implemented directly on it, so it’s a hand-crafted application of the idea, not a generic tool. Smells like @sgraf to me … ah yes, `git blame` confirms it.
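The hand-crafted pattern might look roughly like this (a made-up miniature in the same style, not GHC’s actual `Card`): something that reads like an enumeration, but is really an `Int` whose bits carry meaning, so some operations can work directly on the bitvector.

```haskell
{-# LANGUAGE PatternSynonyms #-}

import Data.Bits ((.&.), (.|.))

-- A made-up miniature in the style of GHC's Card: an "enumeration"
-- that is really an Int bitvector with invented semantics:
-- bit 0 = may read, bit 1 = may write.

newtype Perm = Perm Int
  deriving Eq

pattern NoAccess :: Perm
pattern NoAccess = Perm 0

pattern ReadOnly :: Perm
pattern ReadOnly = Perm 1

pattern WriteOnly :: Perm
pattern WriteOnly = Perm 2

pattern ReadWrite :: Perm
pattern ReadWrite = Perm 3

{-# COMPLETE NoAccess, ReadOnly, WriteOnly, ReadWrite #-}

-- Operations go straight to the bits, no pattern matching needed:
permUnion :: Perm -> Perm -> Perm
permUnion (Perm a) (Perm b) = Perm (a .|. b)

canRead :: Perm -> Bool
canRead (Perm a) = a .&. 1 /= 0
```

Users can still write `case p of ReadOnly -> …` as if it were a plain enumeration, while `permUnion` stays a single bitwise or.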