HKD: best or worst thing ever?

sullyj3 · May 1, 2024, 1:42pm

“Higher kinded data” is apparently a common pattern in Haskell.

data Foo f = Foo {
  field1 :: f Int,
  field2 :: f String,
  field3 :: f Bool
}

This allows you to instantiate f at different functors (eg Maybe, Last) to get different functionalities, or Identity if no extra functionality is desired.

There are various articles about HKD and related topics, nonexhaustively including:

and various libraries for working with them, nonexhaustively including:

I haven’t used them in anger before. So, the purpose of this post is to get a sort of community sentiment check on the state of HKD in 2024:

In your opinion are HKDs good or bad overall?
Are they worth the extra boilerplate? Is there much boilerplate?
In various past threads I’ve read, people have seemed to often eventually encounter arcane infelicities that make their use frustrating. Is this still the case now, or have the difficulties mostly been overcome?
Do the aforementioned libraries make their use ergonomic enough? Are there other commonly used libraries that I’m missing?
What are the worst downsides you know of that can’t be avoided?
Are they used in industry? Is this technique still considered relatively experimental, or is it considered pretty well-established/proven? I’m particularly interested in the opinions of people who have skin in the game here.

Vlix · May 1, 2024, 1:56pm

I haven’t really needed them… almost ever?
We do have some complex record types that use type families to determine which fields are required or optional. We could have used HKD if it weren’t for the fact that in one case fieldA is optional, but in the other case fieldB is optional. And doing that with HKD just makes it more annoying, IMHO.

If you have a case where parts or everything starts out optional, and later you make it concrete and set defaults and w/e, then HKD could work. But often you just default before making the original type which just has no optional fields.

I would like to hear from others who do use this pattern and in which situations.

mixphix · May 1, 2024, 3:55pm

The Prairie library for working with record types has been occupying a lot of my brain space recently. I work on a Web interface for a database and I frequently have to create new HTML forms and SQL queries to work with structured data. We write the SQL directly and parse Haskell records from each row, and write forms so that users can change the data in a particular row. We can model this with a higher-kinded record type.

data OrderF b = OrderF
  { orderId :: b OrderId
  , quantity :: b Int
  , rate :: b Double
  , unitId :: b UnitId
  }

type OrderSelect = OrderF (Const Sql)
type Order = OrderF Identity
-- type OrderFromUser = OrderF FormResult
-- type OrderDisplay = OrderF (Const (WidgetFor app ()))

This form doesn’t let us derive Prairie's Record instance. We’d have to “lower” the type to do so:

data Order = Order
  { orderId :: OrderId
  , quantity :: Int
  , rate :: Double
  , unitId :: Shaped Need UnitId -- ~ Identity UnitId, but for a <select required>
  }

mkRecord ''Order
-- generates:

instance Record Order where
  data Field Order ty where
    OrderOrderId :: Field Order OrderId
    OrderQuantity :: Field Order Int
    OrderRate :: Field Order Double
    OrderUnitId :: Field Order (Shaped Need UnitId)

-- These let us define an `Order` via pattern match on `Field Order ty`
tabulateRecordA :: (Applicative f) => (forall ty. Field Order ty -> f ty) -> f Order
tabulateRecord :: (forall ty. Field Order ty -> ty) -> Order

The function tabulateRecordA is a powerhouse. Look closely at the type of its argument: f is a type constructor that applies linearly across all of the fields of a record. For records with a finite number of fields, this is isomorphic to our higher-kinded data type. Examples here and here.

What use do they have in industry? I am also daydreaming of an alternate Yesod.Form implementation that, among other things, moves the “required” and “multiple” attributes to the type level, and using Prairie would allow extremely concise definitions for what it means to “run” a form for a particular record. Writing a form is as easy as writing a value:

orderForm ::
  (RenderMessage app FormMessage) =>
  (Prairie Order Identity -> Prairie Order (FormInput app))
orderForm (Prairie order) = Prairie \field -> case field of
  OrderOrderId -> FormInput numberPortal [("id", "orderid")] (order field).runIdentity
  OrderQuantity -> FormInput numberPortal [("min", "0")] (order field).runIdentity
  OrderRate -> FormInput numberPortal [("step", "0.01")] (order field).runIdentity
  OrderUnitId -> FormInput (selectPortal _) [] (order field).runIdentity

-- for use with
prairieForm ::
  (Record rec, RenderMessage app FormMessage) =>
  (Prairie rec (FormInput app) -> FormFor app (Prairie rec (FormOutput app)))

Tailor-made interfaces to delicate sections of business logic enable greater separation of concerns. Higher-kinded data gives you the power to use types as templates, and Prairie only needs to generate one GADT to do it. Prairie also gives you literal superpowers for combining records with the same Fields:

components ::
  (Applicative f, Applicative g, Record rec) =>
  (forall x. f x -> g x) ->
  (Prairie rec f -> Prairie rec g)
components nat (Prairie f) = Prairie (nat . f)

zipWith ::
  (Applicative f, Applicative g, Record rec) =>
  (forall x. f x -> g x -> h x) ->
  (Prairie rec f -> Prairie rec g -> Prairie rec h)
zipWith nat (Prairie f) (Prairie g) = Prairie (liftA2 nat f g)

In the language of barbies, components is a way to change clothes, and zipWith is a way to wear two sets of clothes at once.

velveteer · May 1, 2024, 4:12pm

Like any pattern there are tradeoffs. I’ve been using records parameterized by type constructors (HKD) on an enterprise project for years, so I can give you my experience report.

We use Haskell records to model our domain entities in an internal event-sourcing architecture. This means we track updates to these entities in their own type (an event). For this purpose it is very convenient to re-use the entity record types between “create” (all fields wrapped in Identity) and “update” logic (all fields wrapped in Maybe). This is the simplest use case and probably the least contentious (i.e. most boring) application of HKD. These records can be nested and some are large, so boilerplate-reducing classes like those from barbies (e.g. TraversableB, ApplicativeB) become tempting to derive.

As you might infer from previous HKD threads, the pattern starts to break down when you wrap a nested HKD type, need more type constructor parameters, or want some fields to exclude the higher-kinded wrapper. I happen to like the barbies library a lot, but its complicated innards can be a deterrent when things go wrong, or if you are using HKD in a way not prescribed by the library.

The use of a type family (HKD colloquially or Wear from barbies) is usually added as a convenience to avoid dealing with Identity everywhere, but in my experience this is not worth it. Wrapping/unwrapping a newtype is much less painful than the type errors you encounter when every field has a type family on it.

Do the aforementioned libraries make their use ergonomic enough? Are there other commonly used libraries that I’m missing?

I will say barbies is the clear winner here if you can afford the complexity. There is also barbies-th: Create strippable HKD via TH for those who prefer to avoid Generic.

higgledy is a great idea but you lose record syntax along with performance. I would probably only use it for personal projects.

hkd I’ve only used briefly, but it’s pretty solid. You still get Generic-based default instances for classes like FZip, FTraversable, etc. What I think is missing are the “constrained” operations we see in barbies like bmapC, bzipWithC, etc. which can be very useful.

One approach missing from your list uses a GHC plugin instead – GitHub - aaronallen8455/hi-fi: Higher kinded records plugin. I don’t have a strong opinion on whether this is better than using generics or TH. Notably the library prescribes a style for record types in order to work optimally, which is a general theme for HKD records.

On my project we ultimately did not pick any of these libraries. We are rolling our own version of the “descriptor” pattern: The Descriptor. I won’t go into too much detail since the code is proprietary, but our approach allows us to write term-level schemas for records that can be interpreted as a “descriptor”, which gives us the ability to e.g. zipWith, traverse, foldMap (and their constrained versions) our HKD records. This approach does not require any meta-programming or GHC plugins, just the willingness of a developer to write the schema.

Suffice to say, I really like the HKD pattern, and you can do some really cool stuff with it. The problems people have with it are usually when they try to stretch past the prescriptions of the pattern. Sometimes those problems get novel solutions (see Data.Functor.Transformer) but with an additional complexity cost.

lucasdicioccio · May 1, 2024, 6:07pm

I was planning to write a small blog post about the technique so I’ve some well-formed thoughts but not ready for publication. I think HKDs a pretty good answer to “ad-hoc” REST APIs where a resource is encoded in various occurrences/contexts with various overlapping but never identical shapes (e.g., you post an object without an ID, and get back an ID and a large description, but you only get a subset of fields when listing objects, and you need to update with yet another set of fields).

A typical illustration of the above, and where I’ve used them in anger in acme-not-a-joke. I found the boilerplate of “writing my own HKD” pretty minimal versus other approaches (having one data per occurrence with the certainty of hitting duplicated-record-fields, or one gigantic unreadable sum-type with a branch per occurrence).

No experience with the cited libraries but some examples where I’ve poked at was in defining a “modulith” where the runtime environment would depend on some HKD indexed by the role (e.g., a read-only node would not need a write-capable runtime).

mixphix · May 1, 2024, 8:13pm

@lucasdicioccio Your example uses type families to control the types of particular fields, essentially controlling the structure of the datatype itself. That application of HKDs can’t be expressed by the uniform application of a functor; there’s no f such that your Order type can be represented as (forall ty. Field Order ty -> f ty). Your HKD needs the additional metadata present at each of its fields in order to resolve its type, which begs the (genuine) question: what is the benefit of a template if it must change shape? Is the complexity more manageable than using smaller types that only capture what’s necessary, and converting between representations?

rhendric · May 1, 2024, 10:31pm

Many uses of HKD in the wild are only ever used to toggle between Maybe and Identity, or NonEmpty and Identity—simple cardinality restrictions, in other words. In such cases, I tend to favor the following lower-kinded approach for its simplicity:

-- For toggling between 0-or-1 and exactly-1:
data Foo opt = Foo {
  field1 :: Either opt Int,
  field2 :: Either opt String,
  field3 :: Either opt Bool
}
type FooOptionally = Foo ()
type FooDefinitely = forall a. Foo a

-- For toggling between 1-or-more and exactly-1:
data MaybeMore more a = MaybeMore a (Maybe (more, NonEmpty a))
  deriving Functor
data Foo more = Foo {
  field1 :: MaybeMore more Int,
  field2 :: MaybeMore more String,
  field3 :: MaybeMore more Bool
}
type FooPlural = Foo ()
type FooSingular = forall a. Foo a


-- Some helper patterns I usually use for matching on these things:

pattern CompletelyRight :: a -> Either Void a
pattern CompletelyRight a <- (either absurd id -> a)
  where CompletelyRight a = Right a
{-# COMPLETE CompletelyRight #-}

pattern It'sNothing :: Maybe (Void, a)
pattern It'sNothing <- (maybe () (absurd . fst) -> ())
  where It'sNothing = Nothing
{-# COMPLETE It'sNothing #-}

This avoids both giant sum types and representable invalid states, sometimes at the cost of a little extra boxing relative to HKD. It’s pretty manageable.

And how could I forget to mention the best part: the exactly-1 case is a true subtype of the others, as it intuitively ought to be! No conversion, not even a coerce, is necessary to use a FooDefinitely value where FooOptionally is desired. Try that with HKD.

tbidne · May 1, 2024, 10:42pm

I’ll add a voice to the “pro-type families” crowd.

“Strict HKD” – as in, using type constructors – is often too limited to be of much use IME. Either you are restricted to each field using the same type constructor (sacrificing flexibility), or you load up your type with multiple type vars for different constructors as needed (sacrificing ergonomics).

Indexing your type with type families (Trees that Grow), however, is a whole different game. It’s a powerful technique, and shouldn’t be the first thing you reach for, but it can be quite nice when you have several large data types that have nearly identical fields, save for differences on a per-field basis.

I haven’t used any libraries for it though, just rolled my own.

lucasdicioccio · May 2, 2024, 9:26am

The win, I find, is not when looking at the code as an artifact. Rather, I think the win resides around the engineering process of defining field names once in a way that translates the “ad-hoc REST” description of the service I interact with. For my ACME lib, the typical description is rfc-8555. The exact set of fields present in various API calls is somewhere between hard to parse and undocumented (and thus subtle difference could occur depending on the interpretations at different server). One could lament that such specs are not formal enough, but this RFC illustrates well what we have to deal with in the wild. Of course the value of the approach boils down to preference, but I value a lot a code structure which lets me “map the spec into some type” first and then in another place restrict/depart/adapt/refine in an extensible way. That’s why I picked a blunderbuss Field definition which allows to entirely replace the field-type but stills enables a way to convey and start from the information “this is the canonical type if we read the spec”.

tristanC · May 2, 2024, 4:12pm

Not sure if this is a valid use-case, but I gave HKD a try to let the caller pick between [] and Vector when fetching remote data. For what it’s worth, here is the change.

velveteer · May 2, 2024, 5:06pm

It’s certainly valid to call this HKD, but is it HKD according to the pattern we see prescribed by barbies and other libraries? Parameterizing a record by a type constructor is a general pattern, but sometimes we say HKD when we mean every field is wrapped in that type constructor – i.e. the “template” type to which @mixphix refers.

Although I think even this usage can be loosened. Consider Data.Functor.Barbie from barbies, which says each field must be wrapped by the type constructor or must be a Monoid.

chrisdone · May 2, 2024, 9:11pm

I piddled about with a similar idea for forms with HKD here Indexed fields exploration · GitHub (in which I created this cool new technique that everyone had already been using for years… ), I think that’s not been explored much because the web shifted from backend to frontend in the 2010s, so a lot of formlets innovation just stopped. And yesod-form was always a brutally simplified type of formlet. Amusingly all this great insight gained in the Haskell world was ignored while the React world decided to reinvent forms all over again and do it badly.

Inadvisedly I continued working on it with GitHub - chrisdone-archive/forge: Haskell form library as my vehicle, which does address optional vs required fields at the type-level and is quite ergonomic. I also implemented error information flow that goes both upwards and downwards (“floor” and “ceiling”), which I consider novel in the design space, and based on practical need.

Reflex happened since then, which was a bit of a game changer.

I’m generally unconvinced that tying a form’s shape to a record is as flexible as it needs to be in practice, it’s an idea that sounds good on paper but later feels constraining. I did in PureScript because you can do such things easily with records in PureScript and realized it is a very modest win.

Now that Htmx is popular and plain old web backends are in vogue again, we might see a resurgence in formlets as an abstraction and new ideas come up (e.g. using Selective (my “BindForm”) or other things).

olf · May 2, 2024, 9:54pm

Disclaimer: I won’t use HKD in the stricter all fields wrapped in the same rank-1 parameter sense.
One aspect that I, like @mixphix, believe could be a great source of convenience is when HKD is used to define both the data structure and its parsers/forms. The hard part is then to define sensible “glue” that traverses the structure Parser to form a Parser (structure Identity). I have not seen a convincing companion data type that can be used to describe “the space between the fields of a record”, which any sensible UI designer would want fine control over. I believe the difficulty of finding good “glue” is also what @chrisdone observed. An alternative, but even less flexible way, is going via type classes like in aeson or in this experiment. That also suffers from the glue problem, this time in the type class instance deriving.

For business logic I found myself using HKD to unify nearly-identical data structures in the Trees That Grow fashion:

-- before:
class HasHeader t where header :: t -> Hdr
data Transaction  = T   {tHeader  :: Hdr, tPayload   :: Playload}
data Transactions = Ts  {tsHeader :: Hdr, tsPayload :: [Playload],  check :: String}
-- after:
data Transaction' f c = T' {header :: Hdr, payload :: f Payload, check :: c}
type Transaction  = Transaction' Identity ()
type Transactions = Transaction' [] String

Unifying data structures this way in an existing code base can be done with little code change, but thereafter remembering the exact definition of all the type aliases puts extra mental burden on the maintainer.
Upside: Extensible Records in Haskell98 without any language extentions like OverloadedLabels or TypeFamilies. Downside: For HKDs like the above, the standard deriving strategies usually fail, so you have to write more boilerplate. Would I do it again if I were to write all the code from scratch? Probably no.

Ambrose · May 3, 2024, 1:24am

Standalone deriving doesn’t work with those? Like, deriving stock Show would work if you bring the constraints. Unless that’s the boilerplate you’re referring to

mixphix · May 3, 2024, 3:57am

I’m honing in on such a “glue” for Yesod.Form that will work for any Record type. With some minor adjustments to the data structures and the clever composition of Applicative Functors, we end up with a transformation

form :: forall ty. FormEntry app ty -> Compose (FormFor app) (FormExit app) ty

where FormEntry is the data necessary for creating a DOM element/parsing the request, FormExit app ty ~ (FormResult ty, WidgetFor app ()), and FormFor app is the same RWST monad but parameterized by the appropriate app variable (rather than an abstract, convoluted MonadHandler m). The functors on both sides of the function arrow are Applicative, so for a Prairie.Record we have

newtype Tabulate rec f = Tabulate (Field rec ~> f)

idIso :: (Record rec) => Iso' rec (Tabulate rec Identity)
idIso = iso (distribute . Identity) (runIdentity . tabulate)

distribute :: (Record rec, Functor f) => f rec -> Tabulate rec f
distribute frec = Tabulate \field -> getRecordField field <$> frec

tabulate :: (Record rec, Applicative f) => Tabulate rec f -> f rec
tabulate (Tabulate f) = tabulateRecordA f

components :: (Record rec) => f ~> g -> Tabulate rec f -> Tabulate rec g
components nat (Tabulate f) = Tabulate (nat . f)

-- data Entry app ty is the elided, adjusted Yesod.Form.Field equivalent
-- data FormEntry app ty = FormEntry (Entry app ty) [(Text, Text)] ty
type FormEntry app =
  Product
    (Entry app)
    (Product (Const [(Text, Text)]) Identity)
-- data FormExit app ty = FormExit (FormResult ty) (WidgetFor app ())
type FormExit app =
  Product FormResult (Const (WidgetFor app ()))

tabulateForm ::
  (Record rec, RenderMessage app FormMessage) =>
  Tabulate rec (FormEntry app) -> -- (Field rec ~> FormEntry app)
  FormFor app (FormExit app rec)
tabulateForm = getCompose . tabulate . components form

We can automatically tabulate a FormFor app action for an entire record rec, that concatenates the Widget components and checks the FormResult in its proper fashion, just by writing a case statement on the Field rec ty what FormEntry app ty to use.

olf · May 3, 2024, 6:58am

While that is neat, it still does not tell me what the glue is. Record is a type class, so as I see it, it neither falls into the HKD category nor is it essentially different from Generic approaches: Once you have your Record instance in place, you have no control over the final shape of the form produced by tabulateForm. But we’re digressing here from the OP. Instead of the recordFieldLens type class method, with HKD one could say:

data UserHKD f = User {
    _name :: f String,
    _age  :: f Int
    }
makeLenses ''UserHKD

newtype HKDLens s a = HKDLens {runHKDLens :: forall f. Lens' (s f) (f a)}

recordFieldLens :: UserHKD (HKDLens UserHKD)
recordFieldLens = User {
    _name = HKDLens name,
    _age  = HKDLens age
    } -- simple enough to be generated by TH

I expect that the tabulateRecordA can also be derived using one of the existing HKD packages, e.g. Data.Functor.Barbie.btraverse.

olf · May 3, 2024, 7:02am

Indeed it does, but requires UndecidableInstances because the constraint C (f FieldOfHKD) is not smaller than C (HKD f). And yes, the standalone instance is a bit of extra boilerplate.

blamario · May 3, 2024, 1:13pm

You mean like this record function?

blamario · May 3, 2024, 1:29pm

Is your tabulate meant to mirror the Representable’s namesake method? You may be interested in Rank2.Distributive class which provides similar functionality for HKD records. I’ve been using it for parsing like you say.

mixphix · May 3, 2024, 1:59pm

A Record rec is kind of a like a Representable Functor, but its Rep is Field rec. I got tabulate from shortening tabulateRecord but it would not surprise me if that’s where the idea originated. The benefit of Tabulate is the ability to separate the underlying data from the functor, emulating some parts of HKDs without sacrificing automatic deriving. A challenge for Record would be to create another associated datatype, essentially the “representation of the representation”: a plain, non-GADT sum type for the fields of a record, so that we could also define type families that have different values on each field. Then Tabulate rec (SomeTypeFamily (FieldEnum rec)) could, for special type families, also change the field’s type, encompassing even more HKD features.