Hello Haskell community!
This is my first post so please be patient with me, especially if I use incorrect terminology or oversee something important/ obvious.
Also please excuse me if I motivate my example more extensively than I actually would have to.
So let’s skip to the chase:
I’ve had a lot of fun using Haskell classes recently and it’s one of the features I love the most about the language as it can express so many different semantics. However, working with it, I have had some issues with the fact there is no fine grained control between adding {-# OVERLAPPING #-}
and {-# OVERLAPPABLE #-}
pragmas and -XIncoherentInstances
.
First of all I of course want to note that I didn’t come up with everything in the code examples by myself but have borrowed some of the ideas from existing HList
-implementations.
To illustrate my issue I wrote up a little .lhs
file to understand my motivation for this and for you to actually try things out.
It can be found here and obviously in the following post:
Say we have the following datatype:
data HTree
= HLeaf Symbol Type
| HNode [HTree]
We can give it a singleton counter part for the value level by writing the following GADT:
type HSTree :: HTree -> Type
data HSTree as where
-- | Make the counterpart of a leaf
HSLeaf :: forall (label :: Symbol) (typ :: Type). typ -> HSTree ('HLeaf label typ)
-- | Make the counterpart for an empty node
HSNode :: forall . HSTree ('HNode '[])
-- | Insert an HSTree into a node
(:+>:) :: forall (root :: [HTree]) (node :: HTree).
HSTree node -> HSTree ('HNode root) -> HSTree ('HNode (node ': root))
infixr 5 :+>:
You can now surely build heterogeneous trees like so:
myHTree :: HSTree ('HNode '[ 'HLeaf "firstLeaf" Int, 'HLeaf "secondLeaf" String])
myHTree = HSLeaf @"firstLeaf" 42 :+>: HSLeaf @"secondLeaf" "foo" :+>: HSNode
But now you generally also would like to access the leafs and/or the nodes which you would probably want to do by performing a search for a label within the tree. Because of having to know that such a search must succeed for a given HSTree
you normally construct a proof that there is such a label with the use of a class
and a proof-GADT
, we’re now going to show that it’s not possible to create such a thing with the current Haskell instance resolution methods and I’m going to propose a fix to this.
Let’s try to write a function called htreeNode'
like so:
htreeNode' :: forall label e tree proxy. proxy label -> HSTree tree -> e
htreeNode' _ _tree = undefined
We first want to create a proof of the element with the label specified being part of the tree and then index the tree by using this proof, let’s first care about the proof:
First we need a type to build our proof with:
data TreeElem (name :: Symbol) (typ :: Type) (tree :: HTree) where
-- | Hurray, we found our element!
Here :: TreeElem name typ ('HLeaf name typ)
-- | provided we have a proof for the rest of the list for within the node
-- to contain an element, we can also construct a proof for it with one
-- more element
Farther :: TreeElem name typ ('HNode lst) -> TreeElem name typ ('HNode (x ': lst))
-- | provided we have a proof for an element in the node to contain a list
-- we can also construct a proof that the node that contains the element
-- contains the element
Deeper :: TreeElem name typ tree -> TreeElem name typ ('HNode (tree ': lst))
We now want to write a class that produces such a proof term like so:
class TreeElemOf (name :: Symbol) (typ :: Type) (tree :: HTree) | name tree -> typ where
treeElemOf :: TreeElem name typ tree
But before we try to do that, let’s take a step back and see what we’re actually trying to implement by taking another look at the htreeNode
function and trying to complete it a bit more with what we’ve done until now:
htreeNode ::
forall label e tree.
TreeElemOf label e tree =>
-- given we have a proof that a an element with name `label` and type `e` is in the tree
HSTree tree ->
e
-- we can extract that element from the `HSTree`
htreeNode =
let
-- the proof that that the element is actually in the tree
proof :: TreeElem label e tree
proof = treeElemOf @label @e @tree
in indexWithProof proof
indexWithProof
here takes a term that tells us how to index the tree and the tree that we’re trying to index.
Writing it should be easy:
indexWithProof :: forall label e tree. TreeElem label e tree -> HSTree tree -> e
-- we have already found the element, yey
indexWithProof Here (HSLeaf x) = x
-- the element is actually a bit farther within the node
indexWithProof (Farther rest) (_ :+>: xs) = indexWithProof rest xs
-- the element is deeper in the tree
indexWithProof (Deeper rest) (x :+>: _) = indexWithProof rest x
So let’s actually see that we can extract an element with that function:
secondLeaf :: String
secondLeaf = indexWithProof (Farther (Deeper Here)) myHTree
Evaluating this actually spits out “foo” as expected, you can try it out.
However, there’s one caveat here: we constructed the proof ourselves which isn’t really practical, after all we just want to name the node we want to extract and then get back the element with that name, like in htreeNode, e.g. we want to call htreeNode @"foo" myHTree
. However, if you try that out you may notice that GHC complains about a missing instance for TreeElemOf
for the HTree.
We will now motivate why we cannot write those and what we’d need to actually do so:
First let’s show that the instances would actually be trivial:
instance TreeElemOf name e ('HLeaf name e) where
treeElemOf = Here
So far, so good: if we have a Leaf, we found our element.
instance TreeElemOf name e x => TreeElemOf name e ('HNode (x ': xs)) where
treeElemOf = Deeper treeElemOf
instance TreeElemOf name e ('HNode xs) => TreeElemOf name e ('HNode (x ': xs)) where
treeElemOf = Farther treeElemOf
And we can actually write both of those instances independently of each other but we cannot write them together;
there is no way to do so as far as I’m aware. You can try out both of them independently and they will work, even
htreeNode
, as long as you don’t try to construct proofs of terms which the instance you chose to leave in cannot
create.
The issue with them is that instance resolution doesn’t consider the context of an instance head at all, hence for
the GHC instance resolution mechanism those two instances are actually exactly the same.
There are good reasons to do so because to not make an arbitrary choice while resolving the instance we would
necessarily have to
- check whether the instance contexts are disjoint.
- trust the programmer
We all know the latter is not an option but the first might be. I do not know the GHC internals but maybe it may be
possible?
My idea would be to have a syntax like this:
instance family TreeElemOf name e ('HNode (x ': xs)) where
TreeElemOf name e ('HNode xs) => treeElemOf = Farther treeElemOf
TreeElemOf name e x => treeElemOf = Deeper treeElemOf
With the following semantics:
- the instance whose context is the first one to match the given context is chosen
- if instance resolution fails for a candidate, the next candidate with a matching context is tried
- if none of the contexts in the instance family is matched, try a more general instance head (i.e. go on with instance resolution as usual)
- there can only ever be at most one instance family for each instance head
This would circumvent a couple of problems:
- the issue that sometimes you don’t know which modules export which instances, which potentially may be harmful because they may overwrite a context unexpectedly
- the issue that if you have exactly the same instance head but you can’t make sure the contexts are disjoint you would have an arbitrary instance chosen.
- the issue that with many instances your program will spend a lot of time in instance resolution and may choose an unexpected instance when not limited to a handful of instances to choose from
One issue with this is that we have to do unification and instance resolution in an alternating manner, which currently is not supported by GHC - if instance resolution succeeded, GHC doesn’t go back to it.
Apart from this, I would guess this would range somewhere between {-# OVERLAPPABLE #-}
and {-# OVERLAPPING #-}
pragmas and -XIncoherentInstances
wrt safety, as the programmer has to be careful that they choose the right order of matching contexts but it’s still not completely arbitrary like with -XIncoherentInstances
, I think limiting the issues of where the instance family is declared to a closed block also adds safety as it makes overviewing the - possibly risky - instances easier and more maintainable.
Thank you for reading, commenting and telling me what issues might linger within this - not only in general but also wrt - possibly - an actual implementation for GHC.