Idea: record syntax for type parameters

Intro

I’ve noticed a common problem cropping up in some of the projects I’ve worked on, regarding heavily-parameterised datatypes with several type aliases for different use-cases.
I’ve also had the concept of a solution percolating around in my head for a while now, so I wanted to get it out and see if anyone has any thoughts on it.

Context

A common pattern in large Haskell codebases is product types that are fully parameterised, for example:

data User' id username password birthday friends = MkUser
  { userId :: id
  , userUserName :: username
  , userPassword :: password
  , userBirthday :: birthday
  , userFriends :: friends
  }
  deriving (Show, Eq, Generic)

We would then define type aliases for different situations:

type User =
  User'
    UserId
    UserName
    ()
    Day
    [UserName]

type UserCreate =
  User'
    ()
    UserName
    PlainTextPassword
    Day
    [UserName]

type UserPatch =
  User'
    ()
    (Maybe UserName)
    (Maybe PlainTextPassword)
    (Maybe Birthday)
    (Maybe [UserName])

Why?

The reasons for doing this, rather than a bunch of independent data declarations, are several:

  1. We can re-use code, for example:

    countUserFriends :: User' id username password birthday [UserId] -> Int
    countUserFriends = length . userFriends
    

Then we can use countUserFriends with User, UserCreate etc.

  1. All the type aliases can share instances. For example if we use deriveJSON, the generated JSON instances will apply for User, UserCreate, UserPatch etc.:

    deriveJSON (defaultOptions { omitNothingFields = True }) ''User'
    
    ghci> encode (MkUser () "michaelscott69" "1234" (YearMonthDay 1965 March 15) [] :: UserCreate)
    {
      "userUserName": "michaelscott69",
      "userPassword": "1234",
      "userBirthday": "1965-03-15",
      "userFriends": []
    }
    ghci> encode (MkUser () Nothing Nothing Nothing (Just ["dwightschruteARM"]) :: UserPatch)
    {
      "userFriends": ["dwightschruteARM"]
    }
    

Note that omitNothingFields will only omit fields for parameterised types with aeson >= 2.2.

  1. We can easily convert between the aliases:

    makeUser :: UserCreate -> IO User
    makeUser userCreate = do
      id <- genId
      pure $ userCreate { userId = id, userPassword = () }
    
  2. It makes it very easy to use database libraries like Opaleye:

    makeAdaptorAndInstanceInferrable' ''User'
    
    type UserSql =
      User'
        (Field PGUserId)
        (Field PGUserName)
        (Field PGPasswordHash)
        (Field SqlDate)
        () -- many-to-many relationship handled by a separate table
    
    userTable :: Table UserSql UserSql
    userTable = table "users" . pMkUser $
      MkUser
        { userId = tableField "id"
        , userUserName = tableField "username"
        , userPassword = tableField "password_hash"
        , userBirthday = tableField "birthday"
        , userFriends = pure ()
        }
    
    insertUser :: Connection -> UserCreate -> IO User
    insertUser conn userCreate = do
      newId <- genId
      passwordHash <- hashPassword $ userPassword userCreate
      let userToInsert = userCreate { userId = newId, userPassword = passwordHash, userFriends = () }
          userSql :: UserSql = toFields userToInsert
      runInsert conn $ Insert userTable [userSql] rCount Nothing
      pure $ userCreate { userId = newId, userPassword = () }
    

Problems

However, there are some drawbacks to this approach. For one thing, as the domain grows, the list of type parameters can become very long and difficult to work with:

data User' id username password birthday friends paperSold bossesSleptWith favouriteResort ......

countUserFriends :: User' id username password birthday [UserId] paperSold bossesSleptWith favouriteResort ...... -> Int
countUserFriends = length . userFriends

Moreover, for every general function like countUserFriends that only needs a subset of the parameters to be specified, every time we add a parameter to User', we have to update the function. Of course GHC is nice enough to tell us exactly where we need to do this, but it’s still a pain, and it’s not totally error-proof if we accidentally get the parameters mixed up.

Another issue is that it becomes harder to use ad-hoc variants of User'. In the Opaleye snippet above, let’s say we wanted to create an ephemeral type alias (or just an inline type annotation) to represent the argument to toFields. This would be very similar to User, with one or two differences, but we still need to recreate the whole parameter list:

type UserToInsert =
  User'
    UserId
    UserName
    PasswordHash
    Day
    ()
-  let userToInsert = userCreate { userId = newId, userPassword = passwordHash, userFriends = () }
+  let userToInsert :: UserToInsert = userCreate { userId = newId, userPassword = passwordHash, userFriends = () }

Again, every time we change User', User or UserSql, we’ll also have to update UserToInsert.

One DRY solution to this is to use partially-parameterised types:

type User'' password friends =
  User'
    UserId
    UserName
    password
    Day
    friends

type User = User'' () [UserName]

type UserToInsert = User'' PasswordHash ()

The problem again is that as the domain grows, and we need more and more different variants of User'', the intersection of all their
parameters becomes smaller and smaller until we end up with a chaotic situation like:

type User'''' id username password friends =
  User'
    id
    username
    password
    Day
    friends

type User''' id password friends = User'''' id UserName password friends

type User'' password friends = User''' UserId password friends

type User = User'' () [UserName]

type UserCreate = User''' () PlainTextPassword [UserName]

type UserToInsert = User'' PasswordHash ()

type UserPatch = User' () (Maybe UserName) (Maybe PlainTextPassword) (Maybe Birthday) (Maybe [UserName])

This quickly becomes unmaintainable.

What can we do about it?

I think that a lot of these issues will go away once we have native first-class row types in Haskell. In the meantime, I imagine a language extension that would solve at least some, if not all of the problems I’ve highlighted.

I’m picturing standard value-level Haskell record syntax, lifted to the level of type parameters.

Explicitly, type parameters in data declarations would now have names, with (optional) kind annotations:

data User'
  { id :: Type,
  , username :: Type
  , password :: Type
  , birthday :: Type
  , friends :: Type
  } = MkUser
  { userId :: id
  , userUserName :: username
  , userPassword :: password
  , userBirthday :: birthday
  , userFriends :: friends
  }

Type aliases could now (optionally) be written using record contructor syntax:

type User =
  User'
    { id = UserId
    , username = UserName
    , password = ()
    , birthday = Day
    , friends = [UserName]
    }

Types could now also be defined using record update syntax, with the exact same semantics as value-level record update:

type UserCreate =
  User      -- note this refers to the type alias, not the data declaration
    { id = ()
    , password = PlainTextPassword
    }

In theory, we could also have some kind of record accessor syntax:

type UserPatch =
  User      -- note this refers to the type alias, not the data declaration
    { id = ()
    , username = Maybe (username User)
    , password = Maybe (password User)
    , birthday = Maybe (birthday User)
    , friends = Maybe (friends User)
    }

Of course, type aliases could still be written using the normal syntax:

type UserSql =
  User'
    (Field PGUserId)
    (Field PGUserName)
    (Field PGPasswordHash)
    (Field SqlDate)
    ()

Somehow (maybe maybe), we could imagine a kind of record pattern matching at the type level:

countUserFriends :: User' { friends = [UserName] } -> Int
countUserFriends = length . userFriends

Maybe even inline record updates, with bindings?

addUserId :: user@(User' {}) -> user { id = UserId }
addUserId u = do
  id <- genId
  pure $ u { userId = id }

Caveats

I’ve never worked on GHC before. I have absolutely no idea how easy or even possible any of this would be to implement. I would imagine that the more basic aspects would be syntactic sugar, but maybe others would be more serious changes.

I’m also not sure how any of this ties in with ongoing GHC development, such as Dependent Haskell, Linear types, or row types.

Questions for the reader

I hope I’ve communicated this clearly enough.Maybe there’s an existing solution that I just don’t know about, or some best practices I can use to avoid the problems I outlined?

Otherwise, it’d be interesting to know if anyone’s tried implementing this before; if not, whether people think it would be a good idea.

Anyway, thanks for reading :slight_smile:

10 Likes

If you’re willing to stomach some mildly ugly surface syntax, you can absolutely use record-like things as type parameters in today’s GHC and get most if not all of what you’re talking about. Here’s a quick and dirty demo: Haskell Playground

Row polymorphism only comes into play when you want to do things like abstract over which fields are present, or have one record type subsume another with a subset of fields. Here, it’s the same set of fields present for every User type; you are just swapping out their types. GHC’s data kinds are entirely sufficient for that.

There are definitely improvements that could be made to the syntax, though! In particular, I’d be curious to see a proposal for a type-level equivalent to OverloadedRecordDot, so that the p.:"fieldName" in my demo could become p.fieldName.

6 Likes

Thanks @rhendric, this is a cool trick I hadn’t considered. It definitely solves some of the problems - ButWith is especially nice.

countUserFriends :: (r:."friends" ~ [Int]) => User' r -> [Int]
countUserFriends = userFriends

Nice.

But there are a few things I feel are missing. For one thing, if we look at the type aliases:

type User = User' (Int :+ String :+ String :+ String :+ [Int] :+ EmptyRecord)

This still means that the parameters have an ordering, which is something I forgot to mention in my original post. With a syntax like the following, the order of the parameter assignments doesn’t matter:

type User =
  User'
    { id = UserId
    , friends = [UserName]
    , username = UserName
    , birthday = Day
    , password = ()
    }

Also, I’m not sure how to do the equivalent of the following with your solution:

--  proposed syntax
addUserId :: user@(User' {}) -> user { id = UserId }
addUserId u = do
  id <- genId
  pure $ u { userId = id }

The following doesn’t compile:

assignId :: Int -> User' r -> (User' r) `ButWith` '[ "id" ::: Int ]
assignId x u = u { userId = x }
    • Couldn't match type: Lookup
                             ["id" ::: *, "username" ::: *, "password" ::: *, "birthday" ::: *,
                              "friends" ::: *]
                             (Replace
                                ["id" ::: *, "username" ::: *, "password" ::: *, "birthday" ::: *,
                                 "friends" ::: *]
                                r
                                "id"
                                Int)
                             "username"
                     with: Lookup
                             ["id" ::: *, "username" ::: *, "password" ::: *, "birthday" ::: *,
                              "friends" ::: *]
                             r
                             "username"
      Expected: Replace
                  ["id" ::: *, "username" ::: *, "password" ::: *, "birthday" ::: *,
                   "friends" ::: *]
                  r
                  "id"
                  Int
                :. "username"
        Actual: r :. "username"
      NB: ‘Lookup’ is a non-injective type family
    • In a record update at field ‘userId’,
      with type constructor ‘User'’
      and data constructor ‘MkUser’.
      In the expression: u {userId = x}
      In an equation for ‘assignId’: assignId x u = u {userId = x}
    • Relevant bindings include
        u :: User' r (bound at Main.hs:74:12)
        (Some bindings suppressed; use -fmax-relevant-binds=N or -fno-max-relevant-binds)
   |
74 | assignId x u = u { userId = x }
   |                ^^^^^^^^^^^^^^^^

Error messsages also aren’t too friendly!

Separate question: I notice that you’re using forall in a sort of DH-way here:

type Lookup :: forall (kvs :: [STPair]) -> Record kvs -> Symbol -> Type
type family Lookup kvs r k where

I’m not super up to date on the status of DH, but I thought using forall like this was an upcoming GHC future. Yet in the playground it will compile fine as far back as GHC 9.6.7. Am I missing something here?

side note, this is also the problem that “trees that grow” supposedly solves. See: haskell playground. At least partially.

4 Likes

This syntax “visible forall” has been part of the type language for a while now. The signature you’re looking at is a “standalone kind signature”. In contrast to the term language, Haskell had a dependent type language for a while now, so you can write things like

data Foo = Bar | Baz 
type family (foo :: Foo) :: Type

The goal of dependent Haskell is to unify type and term language more and more. A very recent (ghc 9.10) step in that direction was the introduction of -XRequiredTypeArguments 6.4.19. Required type arguments — Glasgow Haskell Compiler 9.15.20250903 User's Guide

which allows you to write

x :: forall a -> a -> a

Mind that this is still not a dependent type, since the a may not be mentioned on the term level, it’s rather a replacement for something like x :: forall a. Proxy a → a → a

If you want to track the progress of dependent Haskell, @int-index who is the main person currently working on dependent Haskell has made a writeup of what needs to be (and has already been) done:

5 Likes

That’s fixable too, with a bit more machinery that I elided to try to keep the demo simple. But if you add this:

type UninitRecord :: forall (kvs :: [STPair]) -> Record kvs
type family UninitRecord kvs where
  UninitRecord '[] = EmptyRecord
  UninitRecord (k ::: _ : kvs) = Undefined :+ UninitRecord kvs

type Undefined :: a
type family Undefined where

type Uninitialized (f :: Record fields -> Type) = f (UninitRecord fields)

then you can write, in any order:

type User = Uninitialized User' `ButWith`
  [ "id" ::: Int
  , "friends" ::: [Int]
  , "username" ::: String
  , "birthday" ::: String
  , "password" ::: String
  ]

This isn’t the only way to do it. All I’m doing here is reusing heterogeneous list ideas that have been around for a while; check out the HList package for one of the oldest inspirations.

Oh, you want addUserId to be fully polymorphic in the rest of the User'?

I had

assignId :: Int -> UserCreate -> User
assignId x u = u { userId = x }

but to make that polymorphic, we can add some more machinery to introduce the right type equalities. I had to polish and refactor my demo code to allow this, but in this version, I have:

assignId :: Substitute r '[ "id" ::: Int ] r' => Int -> User' r -> User' r'
assignId x u = u { userId = x }

(Row polymorphism might help here to remove the constraint, but as long as the constraint can be propagated up to a use site where the types are concrete, it’s not necessary. And at least some implementations of row polymorphism — I’m most familiar with PureScript’s — would still require some constraint on a function like this in order to restrict the row type to admissible rows.)

That’s a common problem with type-level programming, to be sure!

To be clear, this isn’t a recommendation that you go out and use the thing I bodged together in your important software projects. The main point I want to make is that GHC’s type system is already rich enough to allow the things you want, so if you’re going to make a proposal to improve the ergonomics in the surface language or in the compiler’s error messages, (1) you won’t need any changes to the type system, and (2) you can prototype your ideas in a library and then specify your proposal in terms of things desugaring to Haskell code you can write today.

3 Likes

This feature request reminds me that I need to add promotion of record syntax to the DH task graph. Maybe we could co-author a proposal if you’re interested.

6 Likes

Cool, was record syntax promotion already part of the plan? Does it look anything like what I outlined above?

This is a great POC. You’re right, I’m not sure if I’d use it in a real project, but it’s good to know that I’m not proposing a huge evolution to GHC’s power :slight_smile:

1 Like

It’s not something I gave much thought to, but it generally falls under the “promote all term-level syntax” umbrella. So if we have

{-# LANGUAGE OverloadedRecordDot #-}
data R = MkR { foo :: Int, bar :: Bool }
x = MkR { foo = 10, bar = True }
y = x { foo = x.foo + 1 }

then we should also have

type X = MkR { foo = 10, bar = True }
type Y = X { foo = X.foo + 1 }

That’s a natural consequence of promotion.

Now, if I understand your idea correctly, it takes this a step further, adding record syntax to type constructors themselves. To me, it feels like a natural extension. As an approximation, I tend to think of type constructors as constructors of an “open” declaration for Type, i.e.

data Type where
  Int    :: Type
  Bool   :: Type
  Maybe  :: Type -> Type
  Either :: Type -> Type -> Type
  Const  :: forall k. Type -> k -> Type
  ...

And of course, these are all valid standalone kind signatures, so

type Const :: forall k. Type -> k -> Type
newtype Const a b = Const {getConst :: a}

And if we support record syntax in GADTs, then we might as well support it in SAKS, thus

type User' :: { id, username, password, birthday, friends :: Type } -> Type
data User' id username password birthday friends =
  MkUser ...
6 Likes

Hi @int-index, sorry for the delay - busy couple of months.

Now, if I understand your idea correctly, it takes this a step further, adding record syntax to type constructors themselves. To me, it feels like a natural extension.

Exactly - everything that we can do with records at the term-level, I’d like to be able to do at the type-level (with type parameters instead of record fields). So I agree that this is a natural consequence of “promote all term-level syntax”.

Maybe we could co-author a proposal if you’re interested.

I’d definitely be interested. Bear in mind though that I’ve never contributed to GHC or DH, so a certain amount of hand-holding might be necessary! How would we proceed?

3 Likes

Great to hear that! The first step would be to write a proposal and submit a PR to GitHub - ghc-proposals/ghc-proposals: Proposed compiler and language changes for GHC and GHC/Haskell.

The most important sections are “Motivation” that presents the feature and explains the need for it, and “Proposed Change Specification” that defines how the language grammar, scoping rules, or typing rules should change.

It’s ok to start with a draft and refine it over time as the discussion progresses. Once the proposal is ready, it can be submitted to the GHC Steering Committee for review.

3 Likes