 # Getting fields from the "union" of two records

I mistakenly believed that it was impossible to do in Haskell (because having a “disjunction” of constraints is tricky) but after reading section 3.1 of Hiromi Ishii’s functional pearl “Witness Me — Constructive Arguments Must Be Guided with Concrete Witness” turns out it’s doable!

Here’s a gist with an example. The gist defines a helper datatype `data RU r1 r2 = RU r1 r2` for the “union” of two records, along with a suitable `HasField` instance.

First, we must define a type-level binary search tree:

``````data Map symbol q
= E
| N (Map symbol q) symbol q (Map symbol q)
deriving (Show, Eq)
``````

Along with some helper type families. Here are their kind signatures:

``````type Insert :: Symbol -> q -> Map Symbol q -> Map Symbol q
-- entries in the first map win in collisions
type Combine :: Map Symbol q -> Map Symbol q -> Map Symbol q
type Find :: Symbol -> Map Symbol q -> q
-- makes all entries in a map equal to some "value" of kind q
type Overwrite :: q -> Map symbol z -> Map symbol q
-- gets the field map of a conventional record, using Generics
type RecordCode :: Type -> Map Symbol Type
``````

Having done that, the next step is to define a version of `HasField` which takes an extra parameter of kind `Location`, which tells us where to find the field:

``````data Location =
InFirst
| InSecond

type UnionHasField :: Location -> Symbol -> Type -> Type -> Constraint
class UnionHasField location x r a | location x r -> a where
getFieldU :: r -> a
``````

Here are the instances for `RU`:

``````instance HasField k r1 a => UnionHasField InFirst k (RU r1 r2) a where
getFieldU (RU r1 _) = getField @k r1

instance HasField k r2 a => UnionHasField InSecond k (RU r1 r2) a where
getFieldU (RU _ r2) = getField @k r2
``````

The final step is to define a `HasField` instance for `RU` which delegates to `UnionHasField`. And, while delegating, we must also calculate the `Location` of each field of the union:

``````instance
UnionHasField
(Find k (Overwrite InFirst (RecordCode r1) `Combine` (Overwrite InSecond (RecordCode r2))))
k
(RU r1 r2)
a =>
HasField k (RU r1 r2) a
where
getField r = getFieldU @(Find k (Overwrite InFirst (RecordCode r1) `Combine` (Overwrite InSecond (RecordCode r2)))) @k r
``````

We obtain the map of fields of `r1`, and we make all the “values” equal to `InFirst`. The same for `r2` but with `InSecond`. Then we combine the maps and find the record in which we should find the field `k`.

(It’s a bit annoying that we must write that complicated type family expression two times, one in the constraints and one in the type application. It might even be slower to compile. How to avoid that?)

Now we can do something like:

``````data Person = Person {name :: String, age :: Int}
deriving (Show, G.Generic)
data Address = Address {street :: String, number :: Int}
deriving (Show, G.Generic)

personAddress = RU (Person "John" 50) (Address "Rue del Percebe" 2)

main :: IO ()
main = do
print \$ getField @"name" personAddress
print \$ getField @"age" personAddress
print \$ getField @"street" personAddress
print \$ getField @"number" personAddress
``````
3 Likes

You can bind it to a type variable:

``````instance
( loc ~ Find k (Overwrite InFirst  (RecordCode r1)
`Combine` Overwrite InSecond (RecordCode r2))
, UnionHasField loc k (RU r1 r2) a
) => HasField k (RU r1 r2) a
where
getField r = getFieldU @loc @k r
``````

Here is my full code with a naive unbalanced map implementation (I totally missed your gist link…):

``````{-# LANGUAGE StandaloneKindSignatures #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE AllowAmbiguousTypes #-}

import GHC.TypeLits
import GHC.Generics
import Data.Kind
import GHC.Records

data Map symbol q
= E
| N (Map symbol q) symbol q (Map symbol q)
deriving (Show, Eq)

type Insert :: Symbol -> q -> Map Symbol q -> Map Symbol q
type family Insert a b c where
Insert k v E = N E k v E
Insert k v (N l' k' v' r') = Insert' (CmpSymbol k k') k v (N l' k' v' r')

type Insert' :: Ordering -> Symbol -> q -> Map Symbol q -> Map Symbol q
type family Insert' a b c d where
Insert' GT k v (N l' k' v' r') = N l' k' v' (Insert k v r')
Insert' _  k v (N l' k' v' r') = N (Insert k v l') k' v' r'

-- entries in the first map win in collisions
type Combine :: Map Symbol q -> Map Symbol q -> Map Symbol q
type family Combine a b where
Combine E y = y
Combine x E = x
Combine (N l1 k1 v1 r1) (N l2 k2 v2 r2) = Combine' (CmpSymbol k1 k2) (N l1 k1 v1 r1) (N l2 k2 v2 r2)

type Combine' :: Ordering -> Map Symbol q -> Map Symbol q -> Map Symbol q
type family Combine' a b c where
Combine' LT (N l1 k1 v1 r1) (N l2 k2 v2 r2) = Combine r1 (N (Combine (N l1 k1 v1 E) l2) k2 v2 r2)
Combine' EQ (N l1 k1 v1 r1) (N l2 k2 v2 r2) = N (Combine l1 l2) k1 v1 (Combine r1 r2)
Combine' GT (N l1 k1 v1 r1) (N l2 k2 v2 r2) = Combine l1 (N l2 k2 v2 (Combine (N E k1 v1 r1) r2))

type Find :: Symbol -> Map Symbol q -> q
type family Find a b where
Find k (N l' k' v' r') = Find' (CmpSymbol k k') k (N l' k' v' r')

type Find' :: Ordering -> Symbol -> Map Symbol q -> q
type family Find' a b c where
Find' LT k (N l _ _ _) = Find k l
Find' EQ _ (N _ _ v _) = v
Find' GT k (N _ _ _ r) = Find k r

-- makes all entries in a map equal to some "value" of kind q
type Overwrite :: q -> Map symbol z -> Map symbol q
type family Overwrite a b where
Overwrite _ E = E
Overwrite v (N l k _ r) = N (Overwrite v l) k v (Overwrite v r)

-- gets the field map of a conventional record, using Generics
type RecordCode :: Type -> Map Symbol Type
type family RecordCode a where
RecordCode t = RecordCode' (Rep t)

type RecordCode' :: (k -> Type) -> Map Symbol Type
type family RecordCode' a where
RecordCode' (D1 _ (C1 _ x)) = RecordCode'' x

type RecordCode'' :: (k -> Type) -> Map Symbol Type
type family RecordCode'' a where
RecordCode'' (S1 ('MetaSel ('Just fieldName) _ _ _) (Rec0 fieldType)) = Insert fieldName fieldType E
RecordCode'' (S1 ('MetaSel ('Just fieldName) _ _ _) (Rec0 fieldType) :*: xs) = Insert fieldName fieldType (RecordCode'' xs)

data Location =
InFirst
| InSecond

type UnionHasField :: Location -> Symbol -> Type -> Type -> Constraint
class UnionHasField location x r a | location x r -> a where
getFieldU :: r -> a

data RU r1 r2 = RU r1 r2

instance HasField k r1 a => UnionHasField InFirst k (RU r1 r2) a where
getFieldU (RU r1 _) = getField @k r1

instance HasField k r2 a => UnionHasField InSecond k (RU r1 r2) a where
getFieldU (RU _ r2) = getField @k r2

instance
( loc ~ Find k (Overwrite InFirst  (RecordCode r1)
`Combine` Overwrite InSecond (RecordCode r2))
, UnionHasField loc k (RU r1 r2) a
) => HasField k (RU r1 r2) a
where
getField r = getFieldU @loc @k r

data Person = Person {name :: String, age :: Int}
deriving (Show, Generic)
data Address = Address {street :: String, number :: Int}
deriving (Show, Generic)

personAddress = RU (Person "John" 50) (Address "Rue del Percebe" 2)

main :: IO ()
main = do
print \$ getField @"name" personAddress
print \$ getField @"age" personAddress
print \$ getField @"street" personAddress
print \$ getField @"number" personAddress
``````
1 Like

You can bind it to a type variable

Ah, you’re right! I forgot one could do that.

I was thinking of a related limitation which does seem to exist. Suppose we want to define an extensible records library based on the type-level tree:

``````type Record :: Map Symbol Type -> Type
data Record t where
Empty :: Record E
Node :: Record left -> v -> Record right -> Record (N left k v right)

type Insertable :: Symbol -> Type -> Map Symbol Type -> Constraint
class Insertable k v t where
type Insert k v t :: Map Symbol Type
insert :: v -> Record t -> Record (Insert k v t)
``````

Now `Insert` is an associated type family of an `Insertable` typeclass. The class is now necessary because we want to manipulate `Record` values.

We do the “delegate into an auxiliary typeclass with extra parameters” trick:

``````type Insertable' :: Ordering -> Symbol -> Type -> Map Symbol Type -> Constraint
class Insertable' ordering k v t where
type Insert' ordering k v t :: Map Symbol Type
insert' :: v -> Record t -> Record (Insert' ordering k v t)

instance Insertable' (CmpSymbol k k') k v (N left k' v' right) => Insertable k v (N left k' v' right) where
type Insert k v (N left k' v' right) =  Insert' (CmpSymbol k k') k v (N left k' v' right)
insert = undefined
``````

Alas, now it seems that `(CmpSymbol k k')` must be repeated in both the constraints and the body of `Insert`. If we try the equality constraint trick, we get:

``````Main.hs:61:1: error:
* Type variable `foo' is mentioned in the RHS,
but not bound on the LHS of the family instance
``````

I’m not sure if this limitation merely one of verbosity, or if it actually slows down compilation with repeated evaluations of `CmpSymbol k k'`.

I think it also slows down compilation, because I don’t think there is type level common subexpression elimination. If you’re only worried about verbosity then you can also write type synonyms:

``````type FindU k r1 r2 = Find k (Overwrite InFirst  (RecordCode r1)
`Combine` Overwrite InSecond (RecordCode r2))
``````

But this doesn’t really help for `CmpSymbol`.

1 Like

Incidentally, the manually defined `HasField` instance for `RU` seems to work fine with `OverloadedRecordDot` in GHC 9.2.1-alpha1:

``````{-# LANGUAGE OverloadedRecordDot #-}
main :: IO ()
main = do
print \$ personAddress.name
print \$ personAddress.age
print \$ personAddress.street
print \$ personAddress.number
``````
2 Likes