Ad-hoc polymorphism erodes type-safety

NorfairKing · August 29, 2023, 2:28pm

https://cs-syd.eu/posts/2023-08-25-ad-hoc-polymorphism-erodes-type-safety

simonpj · August 29, 2023, 2:50pm

Interesting. See Section 1.3 of GHC’s style guide

tomjaguarpaw · August 29, 2023, 3:03pm

So what’s going on here is that allowListSize was morally

allowListSize :: Foldable [] => Settings -> Int
allowListSize = length . settingAllowList

and it changed to

allowListSize :: Foldable Maybe => Settings -> Int
allowListSize = length . settingAllowList

but because the type class constraint is automatically resolved we don’t get to see this change, become suspicious, and deal with the problem.

Since FTP length is a particularly common culprit. I suggest using length @[] (or whatever) wherever you use it!

DavidB · August 29, 2023, 3:07pm

In private conversations I usually call this the “invisible diff” problem, because this concerns the part of the diff that you usually don’t see when you review a PR on github or gitlab. But it is a diff which could theoretically be made visible by appropriate tooling: The diff only has to be done on the level of the desugared Core instead of on the textual diff.

If I take your example, then the following two versions of the function are generated during compile time (taken directly from godbolt.org):

-- Before
allowListSize :: Settings -> Int
allowListSize = . (length $fFoldableList) settingAllowList

vs

-- After
allowListSize :: Settings -> Int
allowListSize = . (length $fFoldableMaybe) settingAllowList

It might be quite a bit of work to write this tooling, but it should theoretically be possible. You would only need both versions of the desugared Core available, and run a normal textual diff on that.

hellwolf · August 29, 2023, 3:45pm

What does type safety mean here?

I only smell the desired outcome that is “refactoring safe”… not sure you can get that from type system alone, unless using additional machinery that is dependent types (dependent haskell) or ~~gradual types~~ refinement types (liquid haskel)?

Edit: Or we do what all industrial programmers do, write functional test cases if your type system usage is not there yet to guarantee your spec.

jaror · August 29, 2023, 4:02pm

Liquid Haskell uses liquid types which are a form of refinement types.

hellwolf · August 29, 2023, 4:32pm

Thanks for correcting, it was my confusion.

AntC2 · August 30, 2023, 2:12am

Thank you @NorfairKing: food for thought.

Yes, FTP was controversial at the time (and probably still is). For example:

Prelude> length (1, 2)
===> 1
Prelude> length (1, 2, 3)

<interactive>:5:1: error:
    * No instance for (Foldable ((,,) Integer Integer))
        arising from a use of `length'

But you could define instance Foldable for all sorts of tuples.

IMO length should have remained restricted to lists (all those tutorials!); FTP should have introduced a new scaryOverloadableLength for Foldables.

So are there examples of the dangers of refactoring misleadingly failing to throw type errors, other than with length? The article says

can occur for with every class/trait that has more than one instance/impl.

Ok. Are there classes other than Foldable with instances for so many generic data container types?

ChShersh · August 30, 2023, 9:51am

In my years of experience, I developed a rule of thumb of when an abstraction erodes type-safety.

I call typeclasses like Foldable structure-oblivious because they don’t preserve the structure of the argument in their result. That’s why they’re potentially dangerous because if you add more layers on top of your argument you have no way to leverage the type checker to catch potentially wrong behaviour earlier. So you must rely on other tools to verify your code (tests, LiquidHaskell, manual code review, etc.).

Examples of structure-oblivious abstractions:

length :: Foldable f => f a -> Int
show :: Show a => a -> String
toJSON :: ToJSON a => a -> Value

In contrast, some typeclasses are structure-preserving. The type of the structure is preserved in the result. That’s why all types propagate. If you change a value from settingAllowList :: [ClientIdentifier] to settingAllowList :: Maybe [ClientIdentifier], functions like fmap fromClientIdentifier settingAllowList don’t compile anymore.

Examples of structure-preserving abstractions:

(<>) :: Semigroup a => a -> a -> a
fmap :: Functor f => (a -> b) -> f a -> f b

I noticed, when there’s a bug due to a usage of a typeclass, it’s usually because the typeclass is structure-oblivious.

chreekat · August 30, 2023, 11:45am

Nice, and I can intuit why this might be the case! Sounds like the beginning of a HLint rule…

chrisdone · August 31, 2023, 8:46pm

I suppose GHC could warn about any method call which is not fully polymorphic and the definition of which has a given type variable (mentioned in the class head) on the left hand side of the arrow but not on the right hand side. On a superficial reading, that seems to define these structure discarding methods. Is there more to the story?

rhendric · August 31, 2023, 9:46pm

How does HLint or the compiler determine the difference between

middle of some code somewhere (length tree) more code

where tree is a particular Foldable data structure, and

lengthTree :: Tree a -> Int
lengthTree = length

which I infer is the ‘responsible’ way to use this sort of method?

michaelpj · August 31, 2023, 10:19pm

I would be interested to know how many people have actually had a bug caused by this (and of those, in how many cases it was exactly that one Foldable tuple footgun). I don’t think I ever have.

tomjaguarpaw · September 1, 2023, 6:45am

HLint can’t. It doesn’t have type information. GHC can because it does. GHC case can see that in the first example length is used at polymorphic type (I presume that’s what you meant) and lengthTree is used an monomorphic type.

In fact, stan already has a rule for this. (And now for your regular PSA that stan doesn’t support GHC 9.x but I have a branch that adds support up to GHC 9.4).

tomjaguarpaw · September 1, 2023, 6:45am

know how many people have actually had a bug caused by this

I have, a few times over the years, I think.

rhendric · September 1, 2023, 6:49am

I might be misunderstanding, but isn’t the original issue that length was used on one monomorphic type ([ClientIdentifier]), and then the code changed so that it was being used on another monomorphic type (Maybe [ClientIdentifier]) without the programmer realizing? If monomorphic-type-good is the rule, then neither of these would be flagged. If polymorphic-type-good, then how do you ever use a polymorphic function without generating warnings (this is what I was trying to ask in my previous post)?

tomjaguarpaw · September 1, 2023, 6:53am

Oh, I see, you mean how does it tell that lengthTree is a “good” usage? And it’s more a question about the semantics of “good” usage than the mechanics of how tooling detects it?

(I should add a correction about stan: I think what it detects is certain “bad” monomorphic usages, over (,) a particularly, and probably others.)

rhendric · September 1, 2023, 7:01am

Yeah, I guess it’s a question about semantics. I was responding specifically to

And I think either there is more to the story or I just don’t understand what’s being proposed, because sometimes you’re going to want to use a function like length on a not-fully-polymorphic type, so what’s the plan for allowing that when it’s ‘correct’, and what does ‘correct’ even mean?

Ambrose · September 1, 2023, 4:37pm

catamorphism considered harmful

chrisdone · September 1, 2023, 8:00pm

I think the proposal of the post was that you’d use Tree.length or List.length rather than Foldable.length when you know you’re working with a tree or list, for example.

Topic		Replies	Views
Seeking feedback on language editions proposal Announcements	14	2456	December 23, 2024
GHC+DH Weekly Update #6, 2023-06-07	8	2052	June 10, 2023
GHC+DH Weekly Update #7, 2023-06-14	1	1087	June 14, 2023
Rerefined: Refinement types, again (refined rewrite) Show and Tell	2	694	April 30, 2024
Violating memory safety with Haskell's value restriction Links	6	726	May 26, 2025

Ad-hoc polymorphism erodes type-safety

Related topics