How to do user facing records in 2024

Say you’re writing a library and you need to expose records in the interface…

data Foo = Foo
  { _fooX :: A
  , _fooY :: B
  }

Say you know you will end up with records within records, so you know lenses/optics are a thing that will come in handy for the user.

Foo as written is written in the style that was definitely a recommend practice when I first started using Haskell, back when GHC 8 was new. It avoids duplicate record fields, and fields start with underscores so makeLens can generate lenses that take the un-underscored names.

However, these days there are all manners of language extensions around records, but many seem opinionated. They’re fine when your working on your own thing but their use often enforces an opinion on users.

I have no idea what combination of language extensions is actually good practice for user-facing records in modern Haskell.

DuplicateRecordFields

Should I just use DuplicateRecordFields and write:

data Foo = Foo
  { _x :: A
  , _y :: B
  }

…while knowing full well there will be other records with an _x field?

Obviously this still leaves record selectors as ambiguous, along with record updates, sometimes, but it will also lead to name clashes when generating lenses with makeLens.

I could use lens's TH tools for generating classy lenses, but that doesn’t work for non-simple lenses and polymorphic records. Non-simple lenses for polymorphic fields can’t share names.

I could use optics instead and use their OverloadedLabel tech, I think this works with polymorphic records, but then I’m forcing the user into the opinionated choice of optics over lens.

NoFieldSelectors

Field selectors kind of suck! If I ditch field selectors then I can define Foo as:

data Foo = Foo
  { fooX :: A
  , fooY :: B
  }

And still use fooX and fooY as an identifier for lenses. That is neat.
Obviously you can also use this in conjunction with DuplicateRecordFields , and in fact this prevents issues with ambiguous duplicate selectors…

But then I’m forcing on the user the opinionated choice of not having field selectors. If they want them (for some confounding reason) then it is annoying work to write them by hand.

If I don’t turn on NoFieldSelectors it is not possible to export or import the fields without the selectors.

If I turn on NoFieldSelectors but then write selectors anyway, the user can choose whether to import selectors… but then I have to manually write selectors!

At least with NoFieldSelectors on, field selectors are easy to define (or replace) with OverloadedRecordDot's (.field) syntax.

OverloadedRecordDot

Overloaded record dot syntax is great in that I can use it without forcing it on others, and I don’t have to do anything to make sure they can use it…

But suppose I want to use NoFieldSelectors after all, but I don’t want to export manually made selector functions. Only doing this would make my records harder to use records.

You need either lenses/optics or OverloadedRecordDot to get back the ability to refer to record fields without record pattern matching…

But suppose the user of my library is a beginner, fresh out of Learn You A Haskell, expecting there to be field selector functions. They won’t understand lenses and they won’t know the OverloadedRecordDot extension. I need to document how to actually use my records with examples if I want my library to be accessible to these sorts of people.

Do I tell the user to use OverloadedRecordDot with examples? Or do I tell them to use an optics library, with examples. Which optics library do I recommend? Opinions!

Optics

Suppose some of my records need to have their constructors and fields hidden to maintain some invariant.
Now the user can’t make optics for my types if I don’t do it, and I can’t split off the optics definitions to a separate package because they need access to the record constructors.

lens adds a lot of dependencies, but actually I can hand-write most optics without any lens dependencies.
Alternatively I could use microlens-th to generate lenses with a lot less dependencies than lens.

Or I could add an optics dependency and use that library to generate optics optics.

Actually, you can turn lens style VL optics into optics optics… so you could just generate VL optics and let the user convert them to optics optics if they’re needed…

But does that come with a performance cost? Does anyone know?

If I want to provide optics for my user, what kind of optics should I provide? Opinions!

Record sums

NoFieldSelectors means no partial selector functions when you define sums of records like:

data Foo 
  = Foo { x :: A, y :: B }
  | Bar { x :: A } 

That’s cool… but record updates and OverloadedRecordDot syntax can still be partial.

So, should we still not mix sum types and record syntax even with NoFieldSelectors on?

Best Practices?

What are the best practices for modern Haskell for public libraries that are friendly to both the beginner and the seasoned Haskeller, that works for the broadest majority of use-cases?

Should I provide selectors or turn off field selectors and embrace OverloadedRecordDot with both hands?
Or instead of OverloadedRecordDot should I take a lens-first approach?

Should I prefix record fields with the type/constructor name so I don’t have trouble with lenses and polymorphic records, or should I use DuplicateRecordFields and hand-write prefixed lenses for the polymorphic fields?
Or should I use optics and OverloadedLabels?

Or is the best practices circa 2024 still just:

data Foo = Foo
  { _fooX :: A
  , _fooY :: B
  }
20 Likes

As a library author I would not be forcing users into NoFieldSelectors for the reasons you mention. My current preference is to name record fields sensibly, without any kind of prefix, and derive a Generic instance that can be used with generic-lens: Generically derive traversals, lenses and prisms. or generic-optics: Generically derive traversals, lenses and prisms.. This usually implies that I’m enabling DuplicateRecordFields. Admittedly this is not a one-size-fits-all solution.

I do push users towards optics, but mainly because record updates are not as ergonomic with this approach. With duplicate record fields I also expect users to qualify their imports if using field selectors, but this isn’t foolproof. Optics do help here, but I still want users to have options.

This is a good question, though. I am also very curious about how others approach this problem today.

5 Likes

This. E.g. amazonka library did it with their update to 2.0 and it turned out well.

This is redundant for typical record use, optics has built-in support for fields as lenses and constructors as prisms.

I don’t do that yet in public libraries (I definitely do it in applications though), but I will start in the near future. NoFieldSelectors is great for reducing namespace pollution and available since 9.2.1 which was released 2.5 years ago.

5 Likes

Why not? The user can still make optics based on the public API.

This is still what I use because all the other approaches seem to have too many caveats, but it’s possible I just haven’t found the sweet spot yet.

I have been pondering this situation for some time for lsp-types, which exports a huge number of auto-generated record types, so needs a consistent policy (since a machine is going to apply it). The current approach is _-prefixed field names; DuplicateRecordFields (rather unavoidable, since the source has many duplicate field names, and I want to provide combined modules that export lots of things); and makeClassy from lens.

The alternative I am most attracted to is the one described by @velveteer : remove the _ prefixes, don’t provide any lens definitions, and let downstream use generic-lens. This would let us drop the lens dependency, which would be nice, and would also let us drop the _ prefixes, which have always felt artificial to me. In practice, people often still use the field selectors, which is unnecessarily ugly because of the underscores. There is an issue with type families also, see Consider switching to microlens or overloaded dot syntax? · Issue #465 · haskell/lsp · GitHub for discussion. Maybe I should just pull the trigger, though.

Now the user can’t make optics for my types if I don’t do it, and I can’t split off the optics definitions to a separate package because they need access to the record constructors.

I agree with @tomjaguarpaw : if you export a getter and setter, then the user can make a lens from them. If you don’t… then presumably they shouldn’t be able to write a lens, or they’ll break your attempt to hide the fields!

6 Likes

I tend to prefer -XOverloadedRecordDot with -XDuplicateRecordFields. I would also like to use -XNoFieldSelectors but it’s a lot of tedious of work to convert a codebase that started pre-9.0 to one that uses dot syntax instead of selectors.


Here’s the thing about optics: people usually define them with TemplateHaskell. You could write a different templating function than makeLens to use the RecordDot syntax, i.e. to produce code that looked like this:

fooY = lens (.y) (\foo y -> foo{y})

Then you avoid the need for selectors even here. Even the choice of templating function is opinionated!

Another common pattern in Haskell libraries is to provide the core datatypes and functionality from one package, and then various wrapper libraries that export the lens/microlens/optics flavour of choice. It’s more maintainer effort, but helps keep downstream dependency trees tailored!


One thing I find unfortunate about record dot syntax is that it’s not available for nullary constructors. You can’t say

instance HasField "message" (Maybe String) String where
  getField = \case
    Just msg -> msg
    Nothing -> "no message"

bad = Nothing.message

because Nothing gets parsed as a module name!

2 Likes

I don’t think makeLenses uses selectors. The documentation says it just uses the constructor positionally:

e.g.

data FooBar
  = Foo { _x, _y :: Int }
  | Bar { _x :: Int }
makeLenses ''FooBar

will create

x :: Lens' FooBar Int
x f (Foo a b) = (\a' -> Foo a' b) <$> f a
x f (Bar a)   = Bar <$> f a
y :: Traversal' FooBar Int
y f (Foo a b) = (\b' -> Foo a  b') <$> f b
y _ c@(Bar _) = pure c
2 Likes

How does GitHub - ndmitchell/record-dot-preprocessor: A preprocessor for a Haskell record syntax using dot fit into the landscape here?

(I am also interested in knowing how to start a new project in 2024 – assuming I can rely on GHC 9.4 or even 9.6)

Note that today you can do this IMO more pleasantly using additional public sub-libraries. I think this is a very natural way to do these “shim-for-using-my-package-with-package-X” little packages that otherwise proliferate.

1 Like

is there even a non opinionated option? every option will make a part of the interface awkward you can’t support OverloadedRecordDot, record selectors and every optics library simultaneously, that said there is an option where there is maximum freedom, use NoRecordSelectors and provide a Lens for every library you want to support in a separate sublibrary, any one that wants a record selector can easily get one by using OverloadedRecordDot

Hmm, it seems like the take-away here is that there is no agreed upon best practice yet.

DuplicateRecordFields seems to be liked though!

Ah, as in actually using the multi-library package feature for once?

It hadn’t even occurred to me that you could use such a feature to let users opt in to more dependencies for more features without a second package.

Well, you can… just not necessarily first-class support. Either your fields or your lenses need to have a prefix, so you have to pick a favorite there. OverloadedRecordDot is I think pretty unintrusive and doesn’t get in the way of the other options it would seem, but it’s a bit of work to show examples of how to use all three in your documentation, so it will be very tempting to pick a favorite to recommend.

Good point… in my zeal to bring up as many problems I can think of with interacting record extensions I may have imagined one :sweat_smile:

Ah, there’s another thing. When you use DuplicateRecordField do you put all your records into separate modules so selector functions can always be disambiguated by qualified imports, or do you just throw records with duplicate fields in the same module, let the compiler generate selectors and offer them “as is” and leave it up to the user to find an alternative solution when field selectors are ambiguous?

I suppose if duplicate field selectors are defined in the same module then maybe users can avoid ambiguity with:

import M (Foo, Bar)
import M qualified as Foo (Foo(..))
import M qualified as Bar (Bar(..))

I’ve only just thought of doing that…

Yeah I usually have duplicate record fields enabled for the same reason that @michaelpj mentioned, where I have a top-level module as a namespace that imports multiple modules under it, and it’s likely there are duplicate selectors within that namespace.

How users get around ambiguity is up to them in this case. I still expose the child modules if they want to avoid importing the entire namespace. Your example is also an option, and I think it’s generally a good practice.

I think the “just record selectors and Generic instances” is the non-opinionated option. You export something that is pretty much “normal-Haskell”:

  • There are record selectors
  • They have normal names (you don’t prefix them)
  • They can be used as normal (including with OverloadedRecordDot if your users want)
  • You don’t depend on any optics library since you’re not providing any optics
  • Users who want optics can get a pretty good experience still using generic-lens or generic-optics

Yes, this is IMO one of the key usecases for multiple libraries. e.g. lsp-types has a sublibrary for the quickcheck instances, so we can publish them together but people who don’t want them don’t need to incur the quickcheck dependency. It’s great.

4 Likes

There’s a snag when trying to use sublibraries to make optics dependencies optional.

Ideally I’d want to make use of optics-th's templates for creating lenses for my records, and I’d also like to be able to use opticsLabelOptic tech to refer to these lenses with overloaded labels.

Alas, this requires defining instances of the LabelOptic class, which the templates do for you, but to put these instances in a sub-library they have to be orphan instances.

Is it just not that big of a deal in practice that I should define them anyway?

I wouldn’t worry about orphan instances in that case.

Maybe a more worrysome snag is that due to this cabal-bug Per-component dependency solving · Issue #4087 · haskell/cabal · GitHub essentially I don’t think the dependency on optics in such a sublibrary is actually optional.

As an additional data point: I still use the regular. _foo names + lens (sometimes makeLenses or makeClassy, or hand written lens functions). I find the optics approach to use % instead of . too noisy.

1 Like

I don’t understand why sublibraries per se are beneficial here, rather than just having separate libraries.

This is a bigger topic, but I think basically they’re just easier.

  • One cabal file, so you don’t have to repeat metadata and you can use common stanzas across them
  • One version, so you don’t have to think about how to version them independently
  • One package, so you don’t have to release and upload them separately

Flipping it around: why would you use a separate package when you have a sublibrary? The main thing that a separate library gets you is a separate version… but often you don’t need or want that.

(There are of course still tooling issues, which are legitimate reasons to avoid them, but conceptually I think they’re pretty great.)

3 Likes

As a follow-on idea to this, you could expose your getter-setter pair in a private optics-compat sublibrary or similar and then have mything-lens and mything-optics just construct the native optics with the constructors (e.g., lens, in both libraries). In this way, you can hide the particularities from a user while also providing the tool support. If I remember how OverloadedRecordDot works, you could even use these to implement HasField instances and provide opt-in support for that syntax as a library at the cost of some semi-orphans.

Amazonka has similar concerns (generating lots of records from service definitions), exports records with no leading underscore and no other prefix, and I think it works well there. This needs -XDuplicateRecordFields in the modules where the record is defined (for GHC <= 9.6) and in any module which collects and re-exports them (for GHC >= 9.8, if you do that).

Library clients are expected to use whatever record technology they prefer. The Generic instance allows generic-lens/generic-optics, but because all the constructors are exported normal selectors/updates and dot syntax are also usable.

5 Likes