What are you supposed to do about -Wambiguous-fields?

The warning GHC-02256 also hadn’t been documented yet, so I started a PR to document it in the error message index. I currently have one example, which uses the solution that @adamgundry proposes, but I am happy to add other solutions as well :slight_smile:
The PR is here: Document GHC-02256 by BinderDavid · Pull Request #484 · haskellfoundation/error-message-index · GitHub

I would also be grateful for suggestions on how to improve the explanation of the warning.

2 Likes

for playing around:
https://bin.mangoiv.com/note?id=f876d8ce-3a7a-4acf-9ce7-adb348d44b5b

edit: yes, I know it should be (name :: Symbol) because otherwise we don’t get an instance, but this is hacked together

Thanks to @MangoIV and @DavidB I can provide some answers:

  • OverloadedRecordUpdate is considered experimental.
  • There is no definitive story for generating getField for OverloadedRecordDot.
  • There is not even a blessed class for setField, for OverloadedRecordUpdate.

Them’s fighting words :smiley: Type-directed name resolution is controversial in languages with type inference because it creates an ugly chicken-and-egg problem.

One unintuitive thing for me: why is type inference involved at all? In my example, I should think that the type checker could use my user-written type signature as a constraint: bar :: Bar.

blah :: Bar -> Bar
blah bar = bar { foo = 5 }

Rather, this ambiguous nugget seems to be considered in isolation at some point, generating the warning:

{ foo = 5 }

But I don’t need GHC to work so hard; I want to assert what type the record is!
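
For reference, a minimal setup that reproduces the warning; the second record type here is hypothetical, but sharing the field name is exactly what DuplicateRecordFields permits:

{-# LANGUAGE DuplicateRecordFields #-}
module Ambig where

data Foo = MkFoo { foo :: Int }
data Bar = MkBar { foo :: Int }

-- GHC does resolve `foo` from the signature here, but it emits
-- -Wambiguous-fields because this type-directed resolution is
-- slated for removal.
blah :: Bar -> Bar
blah bar = bar { foo = 5 }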

The proposal DuplicateRecordFields without ambiguous field access by adamgundry · Pull Request #366 · ghc-proposals/ghc-proposals · GitHub contains a pretty good discussion of the design space; in particular, it gives good arguments for why type-directed name resolution doesn’t work well with the way GHC works.

As another example: there is a recent bug report, #23992: `$` converts ambiguous record update error to a warning · Issues · Glasgow Haskell Compiler / GHC · GitLab, which shows that your blah bar = bar { foo = 5 } currently works, blah bar = id bar { foo = 5 } does not, but blah bar = id $ bar { foo = 5 } works again. It is just not very transparent or predictable when GHC can do the type-directed name resolution and when it can’t.
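
Side by side (given the Foo/Bar setup above), the three variants from that report:

blah1, blah2, blah3 :: Bar -> Bar
blah1 bar = bar { foo = 5 }        -- accepted, with the warning
blah2 bar = id bar { foo = 5 }     -- rejected as ambiguous
blah3 bar = id $ bar { foo = 5 }   -- accepted again, with the warning

Note that record update binds more tightly than function application, so blah2 parses as id (bar { foo = 5 }); all three are the same program, differing only in how type information reaches the update.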

1 Like

I suspect that when the proposal to add the warning was written, it wasn’t anticipated that development of OverloadedRecordUpdate (then part of the proposed RecordDotSyntax along with what is now known as OverloadedRecordDot) would take so long.

This has led to an unfortunate state of affairs where we’re warned of the need to migrate before the ideal migration strategy actually exists, since OverloadedRecordUpdate is still experimental, and in practice basically unusable due to the need for manual setField instances.

So, like others, I’ve ended up just disabling the warning everywhere I’ve seen it so far, on the assumption that the DuplicateRecordFields behaviour won’t actually change before OverloadedRecordUpdate has been fully implemented. And I would strongly urge any GHC developers on this thread to ensure that’s the case, since anything else would be a very annoying regression.

I fear this whole situation is going to confuse everyone who hasn’t totally kept up with the status of the many record extensions. Which, anecdotally, seems to be almost everyone.

tl;dr I don’t think there’s currently a satisfying answer to the actual title of this thread. (EDIT: at least in the record-update case, which is what we’re discussing; the same warning also covers ambiguous selection, but there are better workarounds for that.)

1 Like

We have indeed been given that guarantee:

https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0366-no-ambiguous-field-access.rst#transition-period

2 Likes

Thanks for digging that out! I’m still unsure about the wisdom of introducing this warning, given that most current code will be fine when OverloadedRecordUpdate arrives, but that’s at least reassuring.

Yes, it does seem a bit strange to have the warning whilst you can’t do anything about it.

This is a long thread! Would anyone feel able to summarise the consensus (if one exists) about what you’d like to happen in GHC?

I hope I don’t summarize this incorrectly by saying that:

  • the current situation of telling the user to migrate to disambiguation based on module qualification doesn’t work well, and one would expect disambiguation to work based on types
  • we would all like to use HasField as a replacement that respects the technicalities of GHC, but in its current state it’s not usable as the replacement it’s supposed to be, because the implementation of OverloadedRecordUpdate hasn’t progressed far enough: in particular, it forces RebindableSyntax on users and doesn’t generate setField for them (see the sketch below)
  • there needs to be a migration strategy for users who cannot currently implement the recommendation the warning suggests

Please add on to this or correct me if I summarized incorrectly.
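
To make the second point concrete, here is a minimal sketch of what OverloadedRecordUpdate currently demands of its users. The SetField class and its instance are hand-rolled placeholders, since GHC does not yet ship a blessed class:

{-# LANGUAGE OverloadedRecordUpdate #-}
{-# LANGUAGE RebindableSyntax #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE AllowAmbiguousTypes #-}
module SetFieldSketch where

import Prelude
import GHC.TypeLits (Symbol)

data Bar = MkBar { foo :: Int } deriving Show

-- Hand-rolled stand-in for the missing blessed class. With
-- RebindableSyntax, `b { foo = 5 }` desugars to `setField @"foo" b 5`,
-- picking up whatever `setField` is in scope.
class SetField (x :: Symbol) r a | x r -> a where
  setField :: r -> a -> r

-- One instance per field, written by hand: this is the pain point.
instance SetField "foo" Bar Int where
  setField (MkBar _) v = MkBar v

blah :: Bar -> Bar
blah b = b { foo = 5 }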

1 Like

Somewhat disagree: I am satisfied by the explanations that allowing type-based record disambiguation creates a large and nasty wart in the way information flows through the compiler. Personally, I quite like proposal #537, because it uses existing syntax, doesn’t leave the big wart in GHC, and lets me write forward-compatible versions of record updates in amazonka-* service packages without having to ship 300+ new sdists to Hackage.

Personally, I’ve never been that excited about any of the recent record extensions. Old-school records were unergonomic, but lenses are so much more versatile that I consider them well worth the learning curve.

1 Like

That’s fair, I was summarising the discussion up to that point.

My memory is that proposal #537 stalled because there wasn’t a clear winner between two options:

  • Permitting the renamer to look at type signatures in limited situations, so e.g. (r :: T a b c) { f = x } would unambiguously pick out the f field of type T. This would be backwards compatible, but is potentially verbose where T has many type parameters, and it may feel rather restrictive that type signatures have to be placed in very specific places to be used when disambiguating.
  • Permitting type names to be used as module qualifiers, rather like LocalModules, e.g. r { T.f = x }. This is not backwards compatible or necessarily obvious to use, and may lead to awkward corners (e.g. if T is a module name as well as a type name) but it avoids writing out the type parameters.

I think either of these is plausible in principle, but the details need working out and we need to somehow reach consensus. It would be great if someone could help drive that discussion forward.

On the OverloadedRecordUpdate/setField front, I put quite a bit of effort into nailing down the design in proposal #583, which has been accepted, and I’m fairly happy with where that ended up. So it’s now a Small Matter of Programming to actually implement it. Unfortunately I’m not sure I currently have bandwidth to do so, but if anyone would like to take this on, I’d be happy to advise.

Correcting one misapprehension from earlier: automatic constraint solving for HasField(getField) (used by OverloadedRecordDot) is fully implemented in GHC and I think it is in reasonable shape. The only obvious thing missing in released GHCs is support for representation polymorphism, which I’ve recently implemented in GHC HEAD.
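
For anyone who hasn’t tried it, a small example of the solver-provided getField in action (the types and fields are illustrative):

{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE OverloadedRecordDot #-}
module DotDemo where

data Person  = MkPerson  { name :: String }
data Company = MkCompany { name :: String }

-- No instances are written anywhere: GHC automatically solves the
-- HasField "name" Person String constraint behind `p.name`, and the
-- duplicate field is no problem because resolution goes through the
-- record's type.
greet :: Person -> String
greet p = "Hello, " <> p.name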

2 Likes

Thanks Adam, I agree with your “wasn’t a clear winner”. (At risk of bikeshedding about mere lexing:) I know the idea is to use a dot suffix to give record access the same flavour as in other languages, but it’s not a good fit with Haskell in general.

What’s more, if we’re updating more than one field, it’s even less obvious:

r { T.f = x, T.g = y }         -- repeat the T. everywhere?
r { T.f = x, g = y }           -- T. implicitly repeated?
r { f = x, T.g = y }           -- retrospectively repeat the T. ?

Inside { ... } we’ve quite a bit of lexical freedom. (For comparison, inside [ ... ] we can put list comprehensions.) Maybe

r { T | f = x, g = y }
r { T a b c | f = x, g = y }
--         ^^ ample opportunities for bikeshedding as to which separator

I appreciate you’re needing someone to drive this. I’m not volunteering because I’ve always had a distaste for Haskell’s approach to record syntax; I find nothing to love about any of the attempts to work round the H98 restrictions. I simply don’t use record field names outside { ... }.

I have added an issue on the GHC bug tracker about reformulating / removing the warning on ambiguous record fields.

1 Like

About your example:

{-# LANGUAGE DuplicateRecordFields #-}
-- the definitions live in their own module (a module cannot import itself):
module Some.Namespace.VeryLongModuleName where

data A = MkA {a :: Int}
data B = MkB {a :: Int}

-- at the use site:
import Some.Namespace.VeryLongModuleName as SomeSemanticallyFittingName (A(..))
import Some.Namespace.VeryLongModuleName as SomeOtherSemanticallyFittingName (B(..))

x :: B -> B
x y = y {SomeOtherSemanticallyFittingName.a = 3}

You can mix and match imports:

import Some.Namespace.VeryLongModuleName (A, B)
import qualified Some.Namespace.VeryLongModuleName as SomeSemanticallyFittingName hiding (A(..), B(..))
import qualified Some.Namespace.VeryLongModuleName as A (A(..))
import qualified Some.Namespace.VeryLongModuleName as B (B(..))

x :: B -> B
x y = y {B.a = 3}

and leave SomeSemanticallyFittingName for whatever other functions you see fit. It doesn’t make sense to name the qualified B import SomeSemanticallyFittingName if you’re importing B to use it as B anyway.

-- if you had this
import qualified Some.Namespace.VeryLongModuleName.VeryLongAbstractFactoryPatternName as VeryLongAbstractFactoryPatternName (VeryLongAbstractFactoryPatternName(..))
-- you could get away with
import qualified Some.Namespace.VeryLongModuleName.VeryLongAbstractFactoryPatternName as FactoryB (VeryLongAbstractFactoryPatternName(..))
-- given proper namespacing and isolating crucial records on a one-record-per-file basis (like public classes in Java)

Another alternative:

{-# LANGUAGE RecordWildCards #-}

x :: B -> B
x (MkB {..}) = MkB {a = 3, ..}   -- with more fields, `..` would copy the rest across unchanged

For short types, prefixing fields saves you the imports, but for long types, smart module imports win:

-- alternative 1: bake the type name into the field
data VeryLongThing = MkT {veryLongThingName :: String}

...

-- redundant synonym made later by the user
vltName = veryLongThingName

-- vs

-- alternative 2: short field name, disambiguated at the use site
-- (in Module.Thing:)
data VeryLongThing = MkT {name :: String}

-- (at the use site:)
import Module.Thing (VeryLongThing)
import qualified Module.Thing as VLT (VeryLongThing(..))

...

VLT.name

-- still have the possibility of a synonym
vltName = VLT.name

There might be an opportunity to do some bikeshedding to make imports more palatable, e.g. by allowing something like this (hypothetical syntax):

import Some.Namespace.VeryLongModuleName (A, B)
    qualified as SomeSemanticallyFittingName hiding (A(..), B(..))
                 A (A(..))
                 B (B(..))
                 Everything
                 C hiding (C(..))

I think making naming more convenient is a good way to go, because implementing SetField is HELL; not even optics can save it: you would need to remake the system from the ground up with optics (I wouldn’t mind).

1 Like

Why is this?



Might be rather off-topic, but I wonder how polymorphic update on multiple fields can be formulated with OverloadedRecordUpdate, or more generally, lens or optics.

Consider the following:

data MyData a = MyData { foo :: a, bar :: Maybe a }

To update MyData a to MyData b, we must update both foo and bar simultaneously. This prevents one from using a per-field approach such as setField or lens/optics.
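
Concretely, the only well-typed update is one that touches every occurrence of the type parameter at once (mapMyData is just an illustrative name):

-- Changing only `foo` from `a` to `b` cannot typecheck: the result
-- would have foo :: b but bar :: Maybe a, which is neither MyData a
-- nor MyData b. Updating both occurrences together is fine:
mapMyData :: (a -> b) -> MyData a -> MyData b
mapMyData f (MyData x y) = MyData (f x) (fmap f y)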

I think one way is to give the result of a partial update a partially-updated type, perhaps something similar to McBride’s clowns-and-jokers derivatives. One downside is the need for some kind of generic programming, and perhaps it sacrifices efficiency.

Is there any discussion of this kind of type-changing bulk update? I think we need to achieve this kind of update in order to replace the existing record syntax with NoFieldSelectors and OverloadedRecordUpdate.

2 Likes

There are many caveats depending on what you set out to do.

  1. If you want to support type-changing updates, the functional-dependency solutions might have trouble with type inference and would need a lot of type annotations (see the sketch below).
  2. If you want to support multiple type-changing updates, it’s even worse.
  3. If you forgo functional dependencies, you have to either do magic in GHC, introduce new first-class anythings, or go for type families. Compilation times might increase unacceptably.
  4. Depending on the magic needed, you might have to forgo supporting duplicated record names, because avoiding the typeclass constraint and just using a function may be better.
  5. If you still want all the goodies and decide to reimplement a new solution: does it run better? Does it run acceptably on big projects? Does it have other problems? Can it do everything and more? And more importantly, can old records be warped into the new structures without breaking backwards compatibility? You can turn your records into optics prisms, lenses and traversals, so why is everybody waiting for the panacea? Backwards compatibility.

And I’m not even bringing extensibility up.
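
To give a feel for caveats 1 and 2, here is a sketch of a type-changing update class in the style of generic-lens’s HasField; the class name and the Wrap type are made up for illustration:

{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeApplications #-}
module TypeChanging where

import GHC.TypeLits (Symbol)

-- Five parameters and four functional dependencies just to keep
-- inference alive: s is the record before the update, t after,
-- and a/b the field types before and after.
class SetFieldTC (x :: Symbol) s t a b
    | x s -> a, x t -> b, x s b -> t, x t a -> s where
  setFieldTC :: b -> s -> t

newtype Wrap a = Wrap { unwrap :: a }

instance SetFieldTC "unwrap" (Wrap a) (Wrap b) a b where
  setFieldTC b (Wrap _) = Wrap b

-- This works only because `a` occurs exactly once in Wrap; for a
-- record with several occurrences no single-field instance exists,
-- and use sites often still need annotations or type applications:
demo :: Wrap Int -> Wrap String
demo = setFieldTC @"unwrap" "five"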

If SetField is promised as the thing that “will solve everything” once stable, it’s a very tough problem to crack if you don’t want to compromise on anything (retaining old capabilities, type inference, ergonomics, backwards compatibility).

I do not envy Adam on this.

2 Likes