Type classes and data structures

Apologies for the fuzzy title - I don’t really know what exactly I’m looking for, so it’s hard to verbalize…

My toy problem is this: I want to come up with a minimalist abstraction for graphs in order to generate graphviz/dot renderings for arbitrary data representations. In general, these may come in two flavors:

  • edges embedded in the data type (e.g. tree nodes)
  • edges from an external data structure (e.g. an incidence list)

In Scala, I might do something like this:

trait Graph[A] {
  def edges(v: A): List[A]

// instance for an "embedded" case
implicit def binTreeGraph[A]: Graph[BinTree[A]] = {
    case BinLeaf(_) => List.empty
    case BinINode(_, l, r) => List(l, r)

// scaffolding for an "external" case
type IncidenceList[A] = List[(A, List[A])]

def incidenceGraph[A](incs: IncidenceList[A]): Graph[A] =
  (v: A) => incs.find(_._1 == v).fold(List.empty[A])(_._2)

// usage
def printEdges[A : Graph](a: A): Unit = ???
printEdges(BinINode(2, BinLeaf(1), BinLeaf(3)): BinTree[Int]) // embedded
printEdges("a")(incidenceGraph(List("a" -> List("b", "c")))) // external

I’m wondering how to tackle this in Haskell. Naively going with a type class approach and the same API…

class Graph a where
  edges :: a -> [a]

…I can easily write instances for embedded cases. But what about the external variants? The best I could come up with is this:

type IncidenceList a = [(a, [a])]

class IncidenceListProvider a where
  incidences :: IncidenceList a

newtype Vertex a = Vertex a deriving (Eq, Show)

instance (IncidenceListProvider a, Eq a) => Graph (Vertex a) where
  edges (Vertex v) = Vertex <$> fromMaybe [] (lookup v incidences)

This kind of works…

newtype StrV = StrV Text deriving (Eq, Show)

instance IncidenceListProvider StrV where
  incidences = [(StrV "a", [StrV "a", StrV "b"]), (StrV "b", [StrV "c"])]

print $ edges (Vertex (StrV "a"))

…but it feels odd and convoluted. I’m sure there’s better approaches. Any suggestions?

I think you are running into one of the big differences between Scala’s traits/implicits and Haskell’s type classes. In Haskell you can only have one instance per type, while in Scala you can define multiple local instances and select which one you mean or have it choose automatically.

In Haskell you can do something similar by just using a record of functions:

data Graph a = Graph
  { elems :: [a]
  , edges :: a -> [a]

-- embedded
data BinTree a = BinLeaf a | BinINode a (BinTree a) (BinTree a)

instance Show a => Show (BinTree a) where
  show (BinLeaf x) = "BinLeaf " ++ show x
  show (BinINode x _ _) = "BinINode " ++ show x

binTreeElems :: BinTree a -> [BinTree a]
binTreeElems t = t : case t of
  BinLeaf _ -> []
  BinINode _ l r -> binTreeElems l ++ binTreeElems r

binTreeGraph :: BinTree a -> Graph (BinTree a)
binTreeGraph t = Graph (binTreeElems t) $ \case
  BinLeaf _ -> []
  BinINode _ l r -> [l, r]

-- external
type IncidenceList a = [(a, [a])]

incidenceGraph :: Eq a => IncidenceList a -> Graph a
incidenceGraph xs = Graph (map fst x) $ \x -> fromMaybe [] (lookup x xs)

printEdges' :: Show a => Graph a -> IO ()
printEdges' (Graph elems edges) = for_ elems $ \x ->
  putStrLn (show x ++ " -> " ++ show (edges x))

main1 = do
  printEdges' (binTreeGraph (BinINode 2 (BinLeaf 1) (BinLeaf 3)))
  printEdges' (incidenceGraph [("a", ["b","c"])])

But then you can no longer directly call printEdges on the BinTree. For that you could use a type class:

class IsGraph a where
  type Elem a
  toGraph :: a -> Graph (Elem a)

instance IsGraph (Graph a) where
  type Elem (Graph a) = a
  toGraph = id

instance IsGraph (BinTree a) where
  type Elem (BinTree a) = BinTree a
  toGraph = binTreeGraph

printEdges :: (Show (Elem a), IsGraph a) => a -> IO ()
printEdges = printEdges' . toGraph

main2 = do
  printEdges (BinINode 2 (BinLeaf 1) (BinLeaf 3))
  printEdges (incidenceGraph [("a", ["b","c"])])

Edit: Wait there is a problem, can you show the implementation of printEdges? I feel like there is something I’m missing. Specifically how does the printEdges know where to start? How can it find the root of the binary tree?

Edit2: I have updated the code with my own solution by just adding a list of all elements of the graph to the record.


Thanks a lot for the detailed answer!

I’m sorry - re-reading, this is somewhat misleading/ambiguous, indeed. The naive idea is that, for a Graph a, all the values of a are the set of vertices for the graph and the given value of a is the “root” vertex for the operation (rather than a representation of “the graph”). printEdges is just an arbitrary placeholder for any function that wants to operate on the graph structure, given a root vertex.

So, using your BinTree type (and otherwise my code from above):

instance Graph (BinTree a) where
  edges = \case
    BinLeaf _ -> []
    BinINode _ l r -> [l, r]

printEdges :: (Graph a, Show a) => a -> IO ()
printEdges v = traverse_ print (edges v)

main = do
  printEdges $ Vertex (StrV "a")
  printEdges $ BinINode 2 (BinLeaf 1) (BinLeaf 3)

-- output:
-- Vertex (StrV "a")
-- Vertex (StrV "b")
-- BinLeaf 1
-- BinLeaf 3

Of course I cannot get all vertices (or edges) of a graph this way, only those that are reachable from the root.

reachable :: (Graph a, Eq a) => a -> [a]
reachable = allEdges' []
    allEdges' seen v =
      if v `elem` seen
        then []
        else v : concat (allEdges' (v : seen) <$> edges v)

The “embedded” case maps straight to the Scala code. For the “external” case, the Scala code exploits that a) you can have multiple local instances (as you already pointed out) and that b) such an instance can “close over” an incidence list data structure, i.e. reference it as an (OO) instance member. I have tried to emulate the latter with the IncidenceListProvider dance, but this just doesn’t feel right. In particular, I’m willing to go with one level of newtype nesting (in order to emulate the “multiple local instances” aspect), but the double nesting with the Vertex (which seems to be required to get a Graph instance depending on an IncidenceListProvider instance) is a bit too much…

If I understand correctly, with these semantics for the Graph class and your proposed solution, I’d still have to explicitly provide both, the concrete graph and the root vertex (even if these happen to be the same value, as in the embedded case). I was hoping for a solution where I only need to provide the root vertex and the graph comes “for free”, i.e. is inferred somehow. (Note that in the Scala example I’m explicitly passing the incidence graph as an argument, but it could as well be an implicit in scope, and thus “for free”, too - well, kind of…)

Note that I’m not insisting on this specific graph representation at all. This just seemed the simplest thing to do in Scala that was mostly uniform between embedded and external mode and easy and quick to adapt for whatever structure I might want to display or otherwise handle as a simplistic graph. Now I’m looking for some idiomatic Haskell solution for the same requirements, no matter what shape.