Haskell records in 2025 (Haskell Unfolder #45)

Is this a planned feature, is this already happening in GHC?

IIRC, roc claimed to be able to do this.

1 Like

It’s not even planned yet. The first thing that needs to be done is defining what it even means to mutate a field in place: https://github.com/ghc-proposals/ghc-proposals/pull/8. That would be easy to integrate with an explicit approach that uses linear types, but automatic approaches are much more difficult.

The challenge is how to know that there is only one reference. Compile-time analysis are not accurate enough (unless we would do whole program compilation) so a run-time analysis would be needed, but that is quite complicated and also adds its own overhead. So, I don’t expect this to be added soon.

2 Likes

Then there’s the matter of implementation - from issue# 25105: Add MutInt# - GHC GitLab:

To exclude any chance of nondeterminism, mutability must be used in a sequential context. Back in 1996, there was only one (standard) choice for providing that - the monadic interface.

But almost thirty years later, there’s also the arrow and co-monadic interfaces, amongst others - will mutable record fields be expected to support all existing and future means of abstract sequencing? If so, working out what’s required would certainly be an interesting challenge…

Ha! I think both those using OverloadedRecordDot and those using Lenses want to think of employee.person_bio.person_name.lname as a qualified name. Intruding `o` or space-surrounded person_name . lname gets in the way of that reading.

It’s not too hard to define the degree sign ° U+00B0 as compose, and use it on the one machine. But trying to share it across different machines/operating systems usually doesn’t go well. (Isn’t UTF-8 supposed to clear that up?)

Hard to tell quite why, but even by 1999 I guess there was too much legacy code using tightfix . as compose.

The difficulty with trying to support both usages of tightfix . is:

infixr  9  .              -- dot as compose
infixl 10 ( )             -- invisible space as function apply
infixl 11 _._             -- OverloadedRecordDot, as a pseudo-operator
infixl 12 _[ ]{ }         -- suffixed { } with optional invisible whitespace as pseudo-operator

So until the lexer can figure which variety of dot, it can’t tell which sub-expression it applies to, for tackling Type-directed resolution.

Perhaps take the whole of employee.person_bio.person_name.lname as one big undigested blob; don’t try to interpret it until getting to the end? (That’s already what the lexer does with qualified names.) IIRC the Lens usage has all the segments function-type (as arguments to compose), whereas RecordDot needs the leftmost segment to be a datum. Except the leftmost might be an expression/not simple name.

“Ha!” nothing: those two groups can then just hide Prelude.o and adopt . as their own operator.


It’s not too hard to define the degree sign ° U+00B0 as compose […]

…as if the current predicament of . (the full-stop symbol) alone wasn’t already enough of a problem:

As next year will make it thirty years since they officially appeared in Haskell (ver. 1.3)…it’s well past time to see if the Haskell community can devise something better; something that the majority of Haskellers can actually tolerate.

So approve a proposal to deprecate and remove the old H 1.3 record notation by 2036, even if that means Haskell won’t have “standardised” records at all. Considering the overwhelming expression of malaise (or malice) towards the old notation seen here and elsewhere, then no-one ought to care if it was removed permanently at that time!

Orthogonally to all the speculation above, I have a q about using OverloadedRecordDot


Compare these two styles

{-# LANGUAGE  NamedFieldPuns, RecordWildCards   #-}

foo, bar :: Player -> blah
foo p = ...
            ... p.pos.x ...
            ... p.pos.y ...
        ...
            ... p.pos.x ...
            ... p.pos.y ....
        ...

bar (MkPlayer{ pos = pos@(MkPoint{ .. }), .. })     -- puns for x, y and all fields of Player
      = ...
            ... x ...
            ... y ...
        ...
            ... x ...
            ... y ....
        ...

In the foo form, can we expect CSE for the repeated access to x, y? Or the in effect four accesses through p.pos?

The bar form is more verbose at the pattern match, but then it’s binding only once x, y (and all other fields).

On CSE (Common Subexpression Elimination), I tried this with -ddump-simpl:

foo :: Player -> (Int, Int)
foo p = ( p.pos.x * p.pos.y, p.pos.x + p.pos.y )
{-# NOINLINE foo #-}

foo' :: Player -> (Int, Int)
foo' p = let (x, y) = (p.pos.x, p.pos.y)
         in  (x * y, x + y)
{-# NOINLINE foo' #-}

and had:

-- RHS size: {terms: 34, types: 23, coercions: 0, joins: 0/0}
Main.$wfoo [InlPrag=NOINLINE] :: Player -> (# Int, Int #)
[GblId, Arity=1, Str=<L>, Unf=OtherCon []]
Main.$wfoo
  = \ (p_s1rS :: Player) ->
      (# case p_s1rS of { MkPlayer ds_dUK ->
         case ds_dUK of { MkPos ds1_dUM ds2_dUN ->
         case ds1_dUM of { ghc-prim:GHC.Types.I# x1_aVP ->
         case ds2_dUN of { ghc-prim:GHC.Types.I# y1_aVS ->
         ghc-prim:GHC.Types.I# (ghc-prim:GHC.Prim.*# x1_aVP y1_aVS)
         }
         }
         }
         },
         case p_s1rS of { MkPlayer ds_dUK ->
         case ds_dUK of { MkPos ds1_dUM ds2_dUN ->
         case ds1_dUM of { ghc-prim:GHC.Types.I# x1_aVX ->
         case ds2_dUN of { ghc-prim:GHC.Types.I# y1_aW0 ->
         ghc-prim:GHC.Types.I# (ghc-prim:GHC.Prim.+# x1_aVX y1_aW0)
         }
         }
         }
         } #)

-- RHS size: {terms: 24, types: 17, coercions: 0, joins: 0/2}
Main.$wfoo' [InlPrag=NOINLINE] :: Player -> (# Int, Int #)
[GblId, Arity=1, Str=<LP(LP(ML,ML))>, Unf=OtherCon []]
Main.$wfoo'
  = \ (p_s1rN :: Player) ->
      let {
        ds_s1r7 :: Int
        [LclId]
        ds_s1r7
          = case p_s1rN of { MkPlayer ds1_dUI ->
            case ds1_dUI of { MkPos ds2_dUK ds3_dUL -> ds2_dUK }
            } } in
      let {
        ds1_s1r9 :: Int
        [LclId]
        ds1_s1r9
          = case p_s1rN of { MkPlayer ds2_dUI ->
            case ds2_dUI of { MkPos ds3_dUN ds4_dUO -> ds4_dUO }
            } } in
      (# GHC.Num.$fNumInt_$c* ds_s1r7 ds1_s1r9,
         GHC.Num.$fNumInt_$c+ ds_s1r7 ds1_s1r9 #)

but was not sure if that is the step by which GHC would have been clever.

Re virtual fields/the awkwardness of defining your own HasField instances, going via a pattern synonym might be less awkward. (YMMV seeing the number of extensions it needs)

{-# LANGUAGE  PatternSynonyms, ViewPatterns #-}
{-# LANGUAGE  NamedFieldPuns, RecordWildCards #-}

pattern PosD :: Float -> Pos
pattern PosD{ d } <- ((\(MkPos{ x, y}) -> sqrt $ fromIntegral x ^ 2 + fromIntegral y ^ 2)
                                  -> d )

Now you can use d as a virtual field on pos. (You don’t have to mention PosD.)

That’s a uni-directional pattern. I guess with a bi-directional, you could use the virtual field for update.


(Somewhat related) Re the nesting for those short-and-sweet types that are intimately related to their host type (like Pos within Player, or Person_name within Person_bio):

  • It would be nice to ‘flatten’ them into the host type, to avoid a level of indirection on access or especially update.
  • Then provide a pattern to make them available as a virtual subtype
    pattern PosV{ x, y } = MkPlayer{ x_ = x, y_ = y}
    (where x_, y_ are the underlying names in the data Player decl)
  • Needs more work: although that looks bi-directional, it doesn’t mention other fields of Player so can’t be used to build a whole record.
1 Like