Using Witch to convert between structurally similar types

Witch is a library of mine that allows you to confidently convert between values of various types. I’m happy to announce version 1.2.3.1, which adds the ability to define generic conversions between types that are structurally similar.

https://hackage.haskell.org/package/witch-1.2.3.1/docs/Witch-Generic.html

In short, you can now do this:

ghci> data Toggle = Off | On deriving (Generic, Show)
ghci> deriving via Generically Toggle instance From Bool Toggle
ghci> into @Toggle True
On

Witch does this by converting the source type (s) into its generic representation (Rep s), converting that into the generic representation of the target type (Rep t), and then converting that into the target type (t). As long as the generic representations are the same, the conversion will work.

Witch also handles converting any types inside those types, provided that an appropriate From instance exists. For example you can do this:

ghci> data Pair a b = MkPair a b deriving (Generic, Show)
ghci> deriving via Generically (Pair c d) instance (From a c, From b d) => From (a, b) (Pair c d)
ghci> into @(Pair Integer Float) (1 :: Int, 2.3 :: Double)
MkPair 1 2.3

In addition to converting (,) into Pair, it also converts Int into Integer and Double into Float. That’s because Witch provides instance From Int Integer and instance From Double Float.

Thanks for reading! I hope you find this useful (or at least interesting).

16 Likes

I find this pattern of converting between generically-structurally matching types really interesting and powerful. I’ve got a somewhat related library called strongweak (asymmetric conversion?) that does the same thing. For design purposes, strongweak avoids recursive instances and ambiguous classes, which results in further interesting design (e.g. in the place of a From a a instance, generics have an OVERLAPPING instance to copy through matching types).

I think generics are often underappreciated for the power they can provide. Having access to definition metadata is also a huge boon for failable functions. (I wonder if TryFrom generics would be useful.)

2 Likes

Maybe you can simplify your code by depending on generics-sop. It has a helper called “SameShapeAs” which should make it possible to replace your implementations with a simple to . from

2 Likes

Fascinating! I must admit that I didn’t look for prior art before implementing this in Witch. I see that strongweak mentions generic-data, which provides copyRep for doing this as well.

I did consider trying to leverage TryFrom for structure-preserving conversions that could fail, but it seemed intimidating and I didn’t have an immediate use case for it.

Good to know! Honestly I was surprised at how little code was required to implement this using GHC.Generics. I’m not well versed in the various generic libraries though. Is there a compelling reason to prefer the generics-sop library over the generic machinery provided by the base library?

1 Like

The main advantage is that you don’t have to implement your recursion with classes since all of the structure is already reflected in the Rep types (N’ary sums of n’ary products, as GADTs).
They also have a bunch of nice combinators that do almost anything you want to do typically.

3 Likes

generics-sop doesn’t appear to provide access to definition metadata e.g. constructor names. Is that correct or did I miss something? I use it a lot for e.g. fleshing out error messages in generic parsers.

It does, even though I think the interface doesn’t focus on it very much.

https://hackage.haskell.org/package/generics-sop-0.5.1.4/docs/Generics-SOP-Type-Metadata.html

2 Likes

I find the idea technically interesting (must have been a fun to write!) but I’m a little concerned about safety of this mechanism. For instance in this example the programmer - and reviewer - has to remember the shape of Bool ADT (False then True) - otherwise the result would be incorrect; I think I’d prefer an explicit pattern match in this case. I’m curious about possible non-trivial use cases of this library.

EDIT: bonus question, in which circumstances we can use unsafeCoerce instead of repacking?

1 Like

Thanks! I didn’t look in the right place. That’s pretty cool.

This is already the case for very common type classes like Enum and Bounded

1 Like

It’s safe when you trivially know that the shape matches. To borrow from strongweak:

data Strength = Strong | Weak
type family SW (s :: Strength) a :: Type where
    SW Strong a =          a
    SW   Weak a = Weakened a
class Weaken a where type Weakened a
instance Weaken Word8 where type Weakened Word8 = Natural

data A (s :: Strength) = A
  { a1 :: SW s Word8
  , a2 :: String }

Here, A Weak and A Strong clearly have matching generic structures. I would feel comfortable writing deriving via Generically (A Weak) instance From (A Strong) (A Weak). The other direction requires a bounds check, so would be TryFrom, but Witch doesn’t provide generics for that. (strongweak does because it’s the pattern I’m most interested in.)

1 Like

I have a dumb question. (I was hoping I’d find the answer amongst the comments, but I’m not seeing it so far.)

Someone’s declared data Toggle = ... because they want the type safety of not confusing it with Bool. And they’re deriving the instances they think appropriate, including for diagnostics if a program goes wrong. Importantly they’re avoiding instances for classes they think not applicable given the usage they intend.

  • into @Toggle True seems a rather tedious way to write On.
  • If someone wants more meaningful/application-specific names (but not type hygiene) for Bool to hide behind, there’s type synonyms and pattern synonyms.
  • For “fleshing out error messages” isn’t a Show instance enough?

In short: what are some use cases for this genericising? Being structurally similar/same shape is no evidence for semantically similar; exactly because “N’ary sums of n’ary products” is so abstract.

1 Like

I don’t mean recording the value that failed, I mean recording data type, constructor, and field name (or index) where the failure occurred. See GStrengthen for an example. I use it especially in binrep, where I generate binary parsers & serializers from a plain ADT. That way, when I get a parse error, I know which part of the generic parser errored.

I uploaded a reverse engineering project I wrote using this generic pattern a while ago, along with lots of annotation. Take a look at data Seg (s :: Strength) a: it effectively defines two distinct data types, but with the same shape. strongweak’s generics can then write the code to convert between them, very similar to tfausak’s GFrom. I save a lot of manual labour this way-- I’d have to check tons of natural bounds and list sizes otherwise. (Assume that I need the two distinct representations.)

I don’t see a problem with similar shape not implying similar semantics. Witch could certainly write weird, probably unhelpful conversions, but it’s the user doing the asking.

Thanks raehik for trying, but no I don’t get it.

Ok so to achieve that generically you need the ... deriving Generic/TypeRep-provided mappings.

Why can’t you check the bounds and sizes of the Generic-provided metadata directly/that is, why convert to some “canonical” representation?

Which might be precisely the reason the author didn’t declare Enum, Bounded, Ord, ... instances. If by ‘bound’ you mean merely the textually first and last data contructors declared:

  • You can obtain those from the Generic-derived structure;
  • messages/diagnostics talking about bounds will confuse someone who deliberately avoided Bounded; or worse
  • If they deliberately declared a custom Bounded instance/not default deriving, messages might give ‘wrong’ constructors for the bounds.

IOW I do “see a problem with similar shape not implying similar semantics”; and I’d expect a bunch of utilities for parsing and serialising wouldn’t impose surprising semantics. (I guess if you keep that semantics purely internal/don’t expose it in diagnostics, it’s not doing actual harm.)

That was not my point. It was meant as a counter argument to the objection that these instances use structural similarity to derive semantics. This is the case for Enum and Bounded as well. If it doesn’t apply for your types, don’t use it, there’s still no obligation to use these libraries in all situations or a recommendation to do so.

It just says, IF the structural similarity is relevant, then here’s a tool to use it.

1 Like

I’m not checking metadata, I’m checking value bounds when going e.g. from Natural to Word8. Neither do I convert to a canonical representation (other than the generic Rep, but I assume you didn’t mean that one). That data type linked has a couple hundred fields in total, which is a couple hundred tedious bound checks that are written for me instead.

These were separate topics: my generic parsing happens to heavily use generic metadata, while the similar shape discussion was regarding Witch and strongweak. (Unrelated, but I pose my parser/serializer is a lot less surprising than binary/cereal, as it requires you to manually encode the constructor tag for sum types.)

I was quoting “canonical” from here that you link to.

Ok. Then I’m still not seeing a use case. Structural similarity is more a less an accident, because the formalism is so abstract … we might say, so generic :wink: .

1 Like

When I write “canonical” in the strongweak docs I don’t mean some universal representation, I mean “the type that gets the instances” (while the other is defined in an associated type family). I should tweak the docs to disambiguate-- thanks for clarifying.

I agree that doing checks “directly”/without altering values or changing representation is useful. You can do this with Witch and strongweak (e.g. convert between newtypes), it’s just not enforced.

I can see why using structural similarity in this way might feel weird. Indeed, it’s a convenience that only works when your types look a certain way-- or more likely, when you can make them look a certain way. I wouldn’t really use it without a trick like type family SW above, that guarantees structural similarity. In such cases, it’s not an accident, and it can save writing lots of boilerplate.

I used generics-sop in my dissertation for this sort of thing: lowarn/transformer/src/Lowarn/Transformer.hs at main · lowarn-dsu/lowarn · GitHub

(also see 3.5 type-driven state transformation here: lowarn/dissertation.pdf at main · lowarn-dsu/lowarn · GitHub)

I ended up taking it to extremes and implementing automatic reordering based on field/constructor names.

2 Likes