Disclaimer!!!
This has been done with the help of Gemini: Lord knows I can’t come up with the Type level stuff yet on my own. The words typed in this post are all my own; most of the Haskell code has been generated by Gemini bar some name changes while I poked and prodded and comments are mine.
As it currently stands, after three days (closer to one year on and off now), I still can’t recall/or independently come up with this specific solution, meaning I need a much more solid way to learn this.
Full conversation: here
Context
I’m working my way up to build different kinds of renderers for the first time and I was interested in texture formats and how best to represent them. Texture formats are represented by stuffing all the information into a byte-aligned container some-bits wide: often 32-bits. Furthermore, the bit patterns can vary depending on the format. For example: RGBA8 features 4x8 bit channels for the Red, Green, Blue and Extra channels and R9G9B9A5 features 3x9 bits for the R,G and B channels and 5 for the Extra channel etc.
I’ve been inspired mainly by Rebecca Skinner’s Intro to Type Level Programming with the instance theming and Haskell Unfolder Episode 28: Type families and Overlapping Instances and been trying to include more involved type stuff in my personal work and avoid my automatic inclination to use Typeclasses and KindSignatures for absolutely everything.
I’ll focus on the final result but if there is any interest in the initial solutions and why I rejected them, then I’ll share that info. Earlier solutions can be seen in the conversation.
The nicest solution (IMO) so far…
Key types, classes and instances
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE FlexibleContexts #-}
import Data.Bits (shiftL, (.&.), (.|.), shiftR)
import Data.ByteString.Builder (toLazyByteString, word16LE, word32LE)
import Data.ByteString.Lazy (LazyByteString)
import Data.Proxy
import GHC.TypeLits (KnownNat, natVal)
import GHC.TypeNats (Nat)
import GHC.Word (Word16, Word32)
import qualified GHC.TypeNats as Nat
-- | A field representing a value, the Nat corresponds to the width in bits of the field.
newtype Field (w :: Nat) = Field Word32
-- | A data to combine fields together to list the widths of each field in order
data BitStructure (fs :: [Nat]) where
End :: BitStructure '[]
(:&:) :: KnownNat w => Field w -> BitStructure ws -> BitStructure (w ': ws)
infixr 5 :&:
-- | A class that uses the type level Nats to manipulate the bits of
-- the final underlying structure
class BitManip (fs :: [Nat]) where
packFields :: BitStructure fs -> Word32
unpackFields :: Word32 -> BitStructure fs
totalWidth :: Proxy fs -> Int
instance BitManip '[] where
packFields End = 0
unpackFields _ = End
totalWidth _ = 0
instance (KnownNat w, BitManip ws) => BitManip (w ': ws) where
totalWidth _ = fromIntegral (natVal (Proxy @w)) + totalWidth (Proxy @ws)
packFields (Field val :&: rest) =
let width = fromIntegral (natVal (Proxy @w))
restWidth = totalWidth (Proxy @ws)
maskedVal = val .&. ((1 `shiftL` width) - 1)
in (maskedVal `shiftL` restWidth) .|. packFields rest
unpackFields word =
let width = fromIntegral (natVal (Proxy @w))
restWidth = totalWidth (Proxy @ws)
mask = (1 `shiftL` width) - 1
val = (word `shiftR` restWidth) .&. mask
in Field val :&: unpackFields word
-- | Lets us calculate the width of the type level list (aka, the widths of each field in the bit structure)
-- REQUIRES UndecidableInstances!!!
type family SumWidths a where
SumWidths '[] = 0
SumWidths (n ': ns) = n Nat.+ (SumWidths ns)
-- | Allows us to convert textures to records and back again based on a layout specified in the type-level list of widths.
-- Possibly my favourite thing so far.
class TextureRecord r where
type Layout r :: [Nat]
toStruct :: r -> BitStructure (Layout r)
fromStruct :: BitStructure (Layout r) -> r
--
data RGBA8 etype = RGBA8 {
r8 :: Word32,
g8 :: Word32,
b8 :: Word32,
a8 :: Word32
} deriving Show
instance TextureRecord (RGBA8888 e) where
type Layout (RGBA8888 _) = '[8,8,8,8]
toStruct (RGBA8888 r g b a) =
Field r :&: Field g :&: Field b :&: Field a :&: End
fromStruct (Field r :&: Field g :&: Field b :&: Field a :&: End) =
RGBA8888 r g b a
Results
The ‘A’ in RGBA
Firstly, I wanted to be able to determine what the “extra” value was in a field at the type level. A kind signature was effective here.
data ExtraType = Alpha | Emissivity
data RGBA8 (etype :: ExtraType) = RGBA8 { -- i.e. RGBA8 Alpha
r8 :: Word32,
g8 :: Word32,
b8 :: Word32,
a8 :: Word32
} deriving Show
Creating new format types
Appears to be as easy as defining the record and then specifying its layout in its TextureRecord instance. However, I currently do not like having to manually create the fromStruct and toStruct functions. This feels like it can be determined by whatever Layout Type is.
data RGBA8 (etype :: ExtraType) = RGBA8 {
r8 :: Word32,
g8 :: Word32,
b8 :: Word32,
a8 :: Word32
} deriving Show
instance TextureRecord (RGBA8 e) where
type Layout (RGBA8 _) = '[8,8,8,8]
toStruct (RGBA8 r g b a) =
Field r :&: Field g :&: Field b :&: Field a :&: End
fromStruct (Field r :&: Field g :&: Field b :&: Field a :&: End) =
RGBA8 r g b a
data R9G9B9A5 (etype :: ExtraType) = R9G9B9A5 {
r9 :: Word32,
g9 :: Word32,
b9 :: Word32,
a5 :: Word32
} deriving Show
instance TextureRecord (R9G9B9A5 e) where
type Layout (R9G9B9A5 _) = '[9,9,9,5]
toStruct (R9G9B9A5 r g b a) =
Field r :&: Field g :&: Field b :&: Field a :&: End
fromStruct (Field r :&: Field g :&: Field b :&: Field a :&: End) =
R9G9B9A5 r g b a
data RG16 (etype :: ExtraType) = RG16 {
r16 :: Word32,
g16 :: Word32
} deriving Show
instance TextureRecord (RG16 e) where
type Layout (RG16 _) = '[16,16]
toStruct (RG16 r g) =
Field r :&: Field g :&: End
fromStruct (Field r :&: Field g :&: End) =
RG16 r g
Using packed records
ghci> showHex (packRecord $ RGBA8 255 0 0 1) ""
"ff000001"
ghci> showHex (packRecord $ RGBA8 256 0 0 1) ""
"1"
ghci> showHex (packRecord $ RGBA8 257 0 0 1) ""
"1000001"
ghci> showHex (packRecord $ RGBA8 999 0 0 1) ""
"e7000001"
ghci> showHex (packRecord $ RGBA8 999 0 0x113 1) ""
"e7001301"
ghci> showHex (packRecord $ RGBA8 999 0 0xff 1) ""
"e700ff01"
ghci> showHex (packRecord $ RGBA8 0x77 0x56 0xff 0x12) ""
"7756ff12"
ghci> :t RGBA8888 0x77 0x56 0xff 0x12
RGBA8888 0x77 0x56 0xff 0x12 :: RGBA8888 etype
I need to figure out how to handle values that are too wide for the field. Naturally, the SumWidths instance can only tell if the Fields add up to too great a width, but there is nothing to otherwise constrain the values.
Cons
Currently, all records are specified as Word32 . This feels particularly wasteful when the RGBA8 records could be specified as four Word8s. Changing Field into a datatype with constructors Field8 Word8 | Field16 Word16 | Field32 Word32 feels like the most sensible thing to do.
I tried doing so, but the main pain comes when trying to make the unpackFields function work. I need a function from (Int/Nat) →(CanBeFielded a → Field).
unpackFields word =
let width = fromIntegral (natVal (Proxy @w))
getFieldT w
| w <= 8 = Field8
| w <= 16 = Field16
| otherwise = Field32
restWidth = totalWidth (Proxy @ws)
mask = (1 `shiftL` width) - 1
val = (word `shiftR` restWidth) .&. mask
in (getFieldT width) val :&: unpackFields word
• Couldn't match type ‘Word32’ with ‘Word8’
Expected: Word8 -> Field w1
Actual: Word32 -> Field w1
• In the expression: Field32
In an equation for ‘getFieldT’:
getFieldT w
| w < 16 = Field8
| w < 32 = Field16
| otherwise = Field32
In the expression:
let
width = fromIntegral (natVal (Proxy @w))
getFieldT w
| w < 16 = Field8
| w < 32 = Field16
| otherwise = Field32
restWidth = totalWidth (Proxy @ws)
....
in (getFieldT width) val :&: unpackFields word
Where next?
I think one more thing I would really like is to be able to use generic programming to generate the TextureRecord instances more automatically. This feels like something that should be possible but I’m not quite sure how yet and I refuse to prompt Gemini/ask for more help until I can freely replicate this solution independently. This will be my first time attempting anything with GHC.Generics so a lot to read and prepare for.
Recently I encountered this in the toml-parser package where you could deriving Generic` and then deriving (ToTable, ToValue, FromValue) via GenericTomlTable Type which I would really like to replicate here
Final Thoughts
I quite like this solution so far, it satisfies my immediate requirement which was to have a nice way of interacting creating arbitrary texture formats.
A list of field widths is fairly intuitive and not something that I will be changing once defined which is pretty awesome. Furthermore, packing and unpacking 32-bit values. (In reality, I can’t see myself needing to unpack them either which is great!).
I can see how this pattern can extend out to other bit widths which is pretty exciting.
Another thing I am not comfortable with is the wealth of language extensions required to get this working so far. At the moment, I’ve been just enabling extensions whenever the compiler or tutorial suggests I do but now I’ll definitely be doing more reading and making more effort into understanding what each extension does.
On a more personal note
Given how much I’ve struggled to learn and understand type level programming, it is disheartening to see the machine do it so effortlessly and almost flawlessly. In one sense, it’s great to have this tool available but in another sense, after three years of progress, it’s a little bit crushing to feel so far off despite how far I’ve come.
Oh well. Such is life! Time to build a heckin’ renderer.