Record systems: absence of variants aka sum types

AntC2 · June 19, 2024, 8:27am

Yeah, I notice many of the anonymous records systems papers talk about sum types but then never implement them. Section 2.1 Var vs Rec here, for example. I can see a few work-rounds, which make it probably not worth the complexity:

Put the Variant as an ordinary ADT within a labelled field’s type: {event = Key 'c'} :: {event :: Action Char} vs {event = Mouse x y} :: {event :: Action Float}.
Embed the record within a GADT: Event {key = 'c'} :: Action (Rec {key :: Char}) vs Event {mouse = (x, y)} :: Action (Rec {mouse :: (Float, Float)}).
Use a different (set of) field label(s) for each variant: { key = 'c' } vs { mouse = (x, y) }.

But yeah now you need heavyweight machinery. From a quick Google, purescript is aware of the value of GADTs, but they’re not there yet(?) They offer some work-rounds.

Variants by (set of) label(s) needs variant introduction/elimination type-level functions, sez section 6.1 at that link. But all we have is a typeclass/instance mechanism. I have built that in Hugs/Trex, more as an experiment to show how heavyweight it is than as a workable approach.

mikeplus64 · June 20, 2024, 1:48am

In TypeScript, where all user-defined types are structural (discounting tricks that can emulate nominal types), the easiest way to discriminate different variants imo is a tag field like

enum EventTag { Mouse, Keyboard }
type Event
  = { tag: EventTag.Mouse, event: [x: number, y: number] }
  | { tag: EventTag.Keyboard, key: number }

If you have an Event the tsc compiler is quite good at knowing what variant it is once you have inspected the tag. But tsc is quite good at inferring what variant you have based on whatever unique characteristics each variant has - like typeof it.key === 'number' would satisfy it that you’re looking at a keyboard event. You could equally use classes and the instanceof operator.

AntC2 · June 21, 2024, 2:08am

Thank you @mikeplus64. If I can tease out what’s going on in your example, TypeScript is dynamically typed (following JavaScript); typeof appears in expressions/dynamic despatch?

I’m looking at this manpage on Union Types, although Google took me first to a now-deprecated page that shows dynamic despatch via switch.

Haskell is statically typed, so we don’t just want the compiler to be “quite good at inferring”; it must be absolutely certain via type inference at compile time of the type of every value.

The type decl is merely giving a name for some type (like Haskell type synonyms). The interesting part is the | forming a type union. Although the manpage doesn’t use the term, that’s what I’d call a ‘discriminated union’: each option within the Union must have some unique characteristic that you can test for at run time. Is that sufficient to make those Variants? Let me ask:

I’m not seeing what the tag is achieving. IOW could I define

type Event
  = { mouse: [x: number, y: number] }
  | { key: number }

Then test for which option by switching by the label of the record?

If I do include the tag as well, does the type system make sure I use the appropriate label (event vs key in your example) pairing with the value of the tag?

typeof it.key === 'number'

(oh, three equal signs) Before I go looking at it.key, I need to be sure it contains a field labelled key. If I mess that up, will the compiler tell me in advance/reject the program; or only at run-time?

AFAICT from the academic papers, the hard part with Variants for statically-typed languages is both keeping track of all the possible labels/types forming the union and making sure in consuming the type that all possible variants are considered, and their contents appropriately handled.

So note in the various discussions for records systems in Haskell, we’re assuming the label names form part of the type of the value. (Contrast H98 record syntax for ADTs: the type restricts which labels might appear, but isn’t so smart telling you’ve used fieldname x on a value with constructor Key that doesn’t have an x field.)

mikeplus64 · June 21, 2024, 7:07am

It’s statically typed – all types are erased by the compiler which basically serves to check and remove the type annotations – but with very easy escape hatches into unsafeCoerce equivalents. Static, but not always sound. It has some type-level features that would make even the most experienced type-family-abusers blush .

typeof is a JS operator that can tell you what primitive type (if any) a value is. It is like an extremely limited Data.Typeable.typeOf. Checking the typeof of a sum type, or a shared field of a sum type, is one way of narrowing down what variants it could be. E.g. given a value x: {tag:0,value:string}|{tag:1}, the type of x.tag is 0|1. x.value is a type error - you can’t know there is a value field without discriminating somehow.

The tag is doing what GHC does for you with regular ADTs: Like in Haskell code at runtime each constructor must have its own unique tag, and pattern-matching will check those tags to decide what branch to proceed on, you can do similar in TS by making your own tag.

You can discriminate by different variants of a union in many ways in TS. TS Playground - An online editor for exploring TypeScript and JavaScript here is an example I’ve cooked up where different ways of dispatching.

BurningWitness · June 21, 2024, 9:23am

Hence:

data STag = SMouse
          | SKeyboard

data Tag a where
  Mouse    :: Tag SMouse
  Keyboard :: Tag SKeyboard

data Event a where
  MouseEvent
    :: Rec { tag :: Tag Mouse, x :: Float, y :: Float }
    -> Action SMouse

  KeyboardEvent
    :: Rec { tag :: Tag Keyboard, key :: Char }
    -> Action SKeyboard

Although the tag doesn’t even need to be stored, it’s a property of the event:

toTag :: Event a -> Tag a
toTag (MouseEvent    _) = Mouse
toTag (KeyboardEvent _) = Keyboard

Ultimately it seems like:

Having fully fledged variant support would render ADTs mostly pointless;
Figuring out what | is represented by at runtime would be non-trivial implementation-wise and would probably require additional user input.

So why not stop in the middle? ADTs always bake in runtime information to distinguish sums, row-polymorphic record types without variants never need to distinguish sums.

This should be more than enough for immutable data, and I don’t think variants would matter much for mutable things (representing C unions would be impossible anyway).

AntC2 · June 21, 2024, 10:11am

Thanks again Mike, I’m still working through your latest. To avoid this getting derailed …

Well, no: ADTs are values within the same type. (My first bullet in the first post.) Variants – at least those with differing field labels are at different types [**]. The only way Haskell can handle those currently is via the class/instance mechanism; but that turns out too much different: classes are open (and indeed some instances might be declared in a different module); we want ADTs to be closed/so we can be sure a case has all the possibilities covered.

Or are you saying you want True/False to be different types? GADTs are somewhere in between: different values within the same type determining different types within a match. Note that with stand-alone records we don’t have label determining type.

[**] Assuming ‘first class’ stand-alone extensible records; not Haskell ADT’s pseudo-records.

BurningWitness · June 21, 2024, 10:55am

Ah, okay, so this entire discussion is in the context of extensible records, which brings in even more questions about the runtime representation of everything (and about communicating it).

My question is a simple “what would variants help me do that I wouldn’t be able to otherwise”, and I guess the answer would be “being able to describe sums in an input argument in terms of constraints”, which still seems incredibly niche.

AntC2 · June 22, 2024, 4:57am

It seems from your ‘cooked up’ examples (thank you), the tag is doing more than that: what GHC does with a GADT pattern match; matching ‘improves’ the inferred type of the payload.

Furthermore merely matching on the field label if ('mouse' in ev) { ... also yields that improved type. That is, providing there’s a suitable (Variant/Union) type declared for ev. " The general principle is this: type refinement is only carried out based on user-supplied type annotations ." At a guess, that’s where the Trex 1994 design was headed (link in my o.p.). Note that was long before GADTs appeared in GHC (~2006).

Yeah. This dude seems to be determined to break the typing then patch it up with more and more band-aids, including generics. He’d probably prefer to be coding in Octal, I suspect. What puts me off is the language complexity and sheer number of different ways to achieve more-or-less the same thing. (Or does each way turn out to be subtly different?)

Conclusion: for these purposes GHC provides GADTs. Not “compiler-supported row-type-based sum types” [from o.p.] because there are no first-class/stand alone row-types. Whether PureScript would/should introduce GADTs or drive type improvement from row-types remains to be seen.

enobayram · June 22, 2024, 6:22am

I’m not too familiar with TS either, but my understanding is that their approach to designing their type system is fundamentally very different from Haskell. Typescript’s type system was designed to be able to keep existing common idioms in JS and have them checked by a type checker.

This is very different from Haskell where the language was carefully designed so that the syntax naturally supports the advanced type system and as users, we carefully structure our code to make it type check, whereas a TS developer is prone to yelling “I already checked 37 lines ago that x is a string, why doesn’t the compiler know that here?” without concerning themselves with what kind of type system would be necessary in order to propagate that information through all those syntactic forms.

I occasionally see an announcement for a new typescript version that touts yet another strange thing the type checker can do, like, say this one and think how much they’re spoiling the programmer trying to recover intentions through contorted ways of writing code, while the whole thing could easily be avoided by structuring your code slightly differently so that type checking it is much easier (for humans and compilers).

As I said I didn’t spend much time with Typescript, so I really don’t know whether those strange things actually increase expressivity, i.e whether they let you type useful programs that genuinely need those ad-hoc type checker features and are too impractical/ugly/convoluted etc. to express with the Haskell philosophy. My gut feeling tells me that they only serve to spoil the programmer without really increasing expressivity, but I really don’t know.

AntC2 · June 22, 2024, 9:31am

Yeah. A lot like putting lipstick on a pig. If I could suggest what might make things easier …

Making variables immutable would help quite a bit.

enobayram · June 22, 2024, 9:37am

TBH, I share your sentiment, a good type system should force me to think clearly before anything else, which is the opposite of trying to dance around the programmer to recover their intention. But I find a dismissive attitude to be dangerous, because there are some very smart people behind Typescript and they’re trying something very different. Having a dismissive attitude you could risk being blind sided if they are actually onto something good regardless of how wrong the approach appears to us.

enobayram · June 22, 2024, 9:40am

Oh, and yeah, I think that’s a very common theme, many of the hard problems of the mainstream languages seem to just evaporate when you embrace immutability.

atravers · June 27, 2024, 4:37am

[…] a dismissive attitude […]

No, it’s just a simple matter of not enough available time: it just isn’t possible for one person to thoroughly investigate every single new project as it appears. Furthermore, “having some very smart people […] trying something very different” is no guarantee of success e.g. BitC and IEEE 1394 (FireWire).

So the risk of “being blind sided” that way is actually quite low…and most of us don’t want to be “living on the edge” (for regular/commercial software development) anyway. So just keep stomping those bugs out ;-D

atravers · June 27, 2024, 4:48am

If only Ericsson released Erlang as open source earlier: maybe then the folks at Netscape could have done something really different!

AntC2 · June 27, 2024, 6:44am

Yeah. ‘The weakest link’ is that they’re building on Javascript: " multi-paradigm, supporting event-driven, functional, and imperative programming styles."[wp] And deliberately confusingly named by some brain-dead marketing ‘wizard’ at Netscape.