Collectings occurences of incomplete patterns to make them available to Haddock

Kleidukos · December 28, 2022, 11:08pm

I feel like this is a relatively “easy” win that we can take from Idris (without having to solve the halting problem). From the Idris documentation:

By default, functions in Idris must be covering. That is, there must be patterns which cover all possible values of the inputs types. For example, the following definition will give an error:
fromMaybe : Maybe a -> a
fromMaybe (Just x) = x
This gives an error because fromMaybe Nothing is not defined. Idris reports:
frommaybe.idr:1:1--2:1:fromMaybe is not covering. Missing cases:
        fromMaybe Nothing
You can override this with a partial annotation:
partial fromMaybe : Maybe a -> a
fromMaybe (Just x) = x
However, this is not advisable, and in general you should only do this during the initial development of a function, or during debugging. If you try to evaluate fromMaybe Nothing at run time you will get a run time error.

– Types and Functions — Idris2 0.0 documentation

Beyond the usual benefits, it powers the documentation generation tool:
Screenshot_2022-12-27_at_20-45-42_Data.List

We can’t yet carry proof for pre-conditions that would guarantee totality in a function that is partial, but we can certainly start by exposing this meta-data about a function: Has the function triggered a -Wincomplete-patterns warning?

A similar ticket was opened on the GHC tracker but I’m also interested in broader community opinion on this.

f-a · December 28, 2022, 11:51pm

This might be quite useful but I suspect it carries a number of false negatives:

-- partial function
partialOne :: [a] -> a
partialOne (a:_) = a

-- still partial function
partialTwo :: [a] -> a
partialTwo as = head as

λ> :load /tmp/prova.hs
[1 of 1] Compiling Main ( /tmp/prova.hs, interpreted )

/tmp/prova.hs:2:1: warning: [-Wincomplete-patterns]
    Pattern match(es) are non-exhaustive
    In an equation for ‘partialOne’:
        Patterns of type ‘[a]’ not matched: []
  |
2 | partialOne (a:_) = a
  | ^^^^^^^^^^^^^^^^^^^^
Ok, one module loaded.
λ>

Same for using error in the catchall pattern.

Some checking is definitely better than nothing, we would just need to precisely specify what we can assure to the user.

jvanbruegge · December 29, 2022, 12:11am

Yes, ideally it should also look for transitive partiality and report that. Like partial because of use of head and then link to the haddock of head

Tarmean · December 29, 2022, 7:00am

Without an abstraction barrier I don’t think a transitive partiality analysis is super useful. If all rust code which uses a vector op transitively had to be marked unsafe it wouldn’t be a very helpful marker.

With a hadock annotation to mark functions non-partial I do think a transitive analysis could be really useful, especially if it lists all exception types. But that may be another feature entirely?

Kleidukos · December 29, 2022, 11:09am

Regarding the false negative, could be perhaps be helped by referential transparency? With a set of “guaranteed” partial functions that we ship, could it be possible to infer the cases of direct inheritance of partiality (where the function can trivially be replaced by its partial component, without lets or wheres)?

tomjaguarpaw · December 29, 2022, 12:49pm

I think I’d much rather have a language-level (as opposed to documentation-generator-level) solution to this. One possibility is a Partial => constraint that gets added to the result type of all partial pattern matches. (As I understand it, PureScript has something like this.)

Kleidukos · December 29, 2022, 12:52pm

Yes PureScript has it. In practice, you check out of that by using unsafePartial, which effectively discharges the constraint. I’d love to ask @hdgarrood if the PureScript ecosystem has suffered from it?

jaror · December 29, 2022, 12:53pm

With laziness, transitive inference becomes a pretty difficult. Consider:

f n = let xs = [1..n] in zipWith (+) xs (tail xs)

This function is safe even if n < 1, because in that case the partial tail xs function would never be evaluated.

I guess we could be satisfied with an approximation that just marks everything as partial if there is any doubt at all. I don’t know how useful that would be.

hdgarrood · December 29, 2022, 2:05pm

Previously: Use HasCallStack and error in GHC.List and .NonEmpty · Issue #5 · haskell/core-libraries-committee · GitHub

Generally I think it works really well, but there are a couple of caveats:

it’s tough to explain how to use the Partial class correctly; in my experience people tend not to understand where they should propagate a Partial constraint and just slap an unsafePartial on in situations where they really shouldn’t. Of course, people can write whatever they want in their own projects, and it’s easier to ensure that important libraries are at least using Partial correctly
similarly, there’s a risk of people seeing Partial constraints and incorrectly thinking that the type checker will guarantee that any function lacking a Partial constraint is guaranteed to be total.

I suspect the main barrier to introducing Partial to Haskell would be backwards compatibility though: if an un-discharged Partial constraint doesn’t cause a compilation failure, then realistically there will be tons of functions that lack Partial constraints but should have them, which limits the utility of the constraint; if an un-discharged Partial constraint does cause a compilation failure then you’ll break basically everyone’s code and people will be very upset.

Kleidukos · December 29, 2022, 6:04pm

Thank you very much for this explanation (and reference to the other topic).

chrisdone · December 29, 2022, 7:45pm

My latest thinking on this on a Haskell-like Unison-like (CAS) language of my own is that all compiler warnings (including use of division by zero, throw, etc) generate the Partial constraint, to make sure it bubbles up, but it is hidden from type signatures and cannot be discharged via unsafePartial. Finally, the warning, any call site, can be marked as “off” for that particular call site. You can only do this well with a CAS language, due to changes. The compiler or the IDE would maintain a list of on/off messages for that user.

I think this strikes a nice balance. It keeps you honest but doesn’t bother you.

Kleidukos · December 29, 2022, 10:58pm

To be honest I believe that’s one of the best compromises, putting it at the type-level may be too harsh of a step to introduce the concept at first. I’d prefer to have it appear in the documentation and make it first-class and not trivially substituted with type signatures (even if we know it’s not the case, not everything can (or should) be encoded at the type level unless we get a wonderful new metatheory for Christmas)

rae · December 30, 2022, 12:23pm

I’m against this change, mostly because the information could be wrong in either direction: a partial function might not get the mark (as demonstrated above) while a complete function might (if it works with known-infinite lists, for example). With the inaccuracy of the check, I don’t think the information is all that useful – instead, this strikes me as a mechanism to expose a programmer’s stylistic choice to users, which I don’t think is helpful.

Instead, what about a new bit of markup or special field in Haddock that allows library authors to document partiality? This could be paired with the check proposed, but if a function has a partial match, the warning is suppressed if it is documented. (There would need to be another way to suppress the warning in case the warning is spurious.) This gives control back to the library author, while still enriching our documentation with partiality information.

Kleidukos · December 30, 2022, 12:34pm

Wholeheartedly agree!

david-christiansen · January 2, 2023, 12:46pm

One terminological thing to be aware of is that “partial” is often used differently by Haskell programmers than it is by others. In particular, we typically use the term “partial” to refer to functions that might fail/crash due to an incomplete pattern match, while usage outside of Haskell also includes functions that fail/crash with an infinite loop, by calling exit, or in other ways. In other words, functions that are assigned a type A -> B but that don’t assign elements of B to all possible elements of A.

This is important in proof assistants based on various MLTT- or CoC-like formalisms, because the existence of a proof (and thus the fact that a given type is inhabited) is all that matters. Even fancy dependent Haskell is strict in its proof objects, so directly copy-pasting the design from some other system isn’t necessarily the right way to go.

If I had this feature in Haskell, I think I’d rather know “which values can I safely pass to this function” rather than “do there exist values for which the function is not safe”. In other words, my favorite bike shed color would enable Haddock output like:

head :: HasCallStack => [a] -> a

O(1). Extract the first element of a list, which must be non-empty.

Missing pattern: []

I’m imagining something like:

-- | \(\mathcal{O}(1)\). Extract the first element of a list, which must be non-empty.
--
-- >>> head [1, 2, 3]
-- 1
-- >>> head [1..]
-- 1
-- >>> head []
-- *** Exception: Prelude.head: empty list
--
-- WARNING: This function is partial. You can use case-matching, 'uncons' or
-- 'listToMaybe' instead.
--
-- @missing []
head                    :: HasCallStack => [a] -> a
head (x:_)              =  x
head []                 =  badHead

with one @missing case for each pattern that is not covered. This also allows things like Prelude.head that throw their own custom exceptions manually to use this feature.

A tool could optionally check that GHC doesn’t see undocumented cases. Then, I could also document that things that call head are likely to fail in similar ways, without tying it too tightly to the GHC warning mechanism.