To go or to guard?

A large proportion of functions in Base are defined via a helper function, typically named go:

--  Data.Set
member :: Ord a => a -> Set a -> Bool
member = go
  where
    go !_ Tip = False
    go x (Bin _ y l r) = case compare x y of
      LT -> go x l
      GT -> go x r
      EQ -> True
--  GHC.Base
foldr            :: (a -> b -> b) -> b -> [a] -> b
foldr k z = go
          where
            go []     = z
            go (y:ys) = y `k` go ys

In particular the idiom with that foldr is the go calls itself in tight recursion; the original arguments to the outer/public-facing function (k, z) remain constant.

In other examples, there’s some pre-calculation from the outer arguments, to hand a local value to the helper:

clunky' env vars@(var1:var2:_)
 = help (lookup env var1) (lookup env var2) vars
where
 help (Just v1) (Just v2) vars   = v1 + v2
 help _         _         [var1] = ... some stuff
 help _         _         []     = ... some stuff

we do get three equations, one for each right-hand side, …
[page 3 of Pattern Guards and Transformational Patterns, Erwig+SPJ, Haskell Workshop 2000]

(You can tell this is a somewhat made-up example: help's [var1] pattern isn’t going to match vars if vars has at least two elements per clunky''s top line.)

The complaint “… it is still clunky. In a big set of equations it becomes hard to remember what each Just pattern corresponds to. …” is I’d say a matter of taste. This form is a lot less clunky than the almost perversely clunky form given first in Section 2.2. The form the paper is arguing for/proposing is (for a slightly different signature):

clunky env var1 var2
   | Just val1 <- lookup env var1
   , Just val2 <- lookup env var2
   = val1 + val2
   ... other equations for clunky

As against the clunky' form, this has just as many variables and just as many Just patterns to remember. (These days we might code using ViewPatterns so var1, var2 are implicit/not named. We still need to keep track of just as many Just patterns.)

The proposal went ahead and became part of the H2010 standard. AFAICT, Base doesn’t use that style of guards much. Is that merely because most of it dates to H98?

To my reading, coding the lookups once at the beginning [**] seems cleaner/reduces clutter, and treats the two arguments more symmetrically. Yes there’s a need to introduce a helper/go; that feels natural as an idiom once you’re used to the idea.

In your code do you prefer those pattern-match guards or a helper? Does go style seem old-fashioned?

[**] In particular, suppose we want to branch depending whether both lookups succeed vs only one does vs neither do. With the H2010 style guards or ViewPatterns that needs repeating the code for the lookups. [“exposes a small deficiency”, page 4] The optimiser might eliminate common subexpressions, but we still get double/triple the code to read.

1 Like

I find it more robust to do explicit pattern matching since GHC warns you whether you’re missing any cases.

Pattern Guards and Transformational Patterns - the paper with no case-expression example of pattern guards? A rather peculiar omission - I would be wary of using it as the basis of comparisons…

:woozy_face:. 2000 was another age, in terms of extensions to the language. They were arriving at a furious pace, with no need to ask permission first – especially if your name was Jones. So the paper is rather: this is what we’ve done, FYI. No Committee; no approvals process.

This is a somewhat unusual extension: it’s not introducing anything you couldn’t already express – indeed the Semantics (section 5) merely translates the new syntax into H98.

I guess the grounds of expressivity/elegance/succinctness are tricky to illustrate in a conference paper: examples should be small and focused; but the improvements are really only apparent with large-scale complex logic using many parameters.

I was using the code in Base for comparisons. There’s not many case expressions in general; their guards (if any) are Boolean/H98-compliant.

A rather more interesting question: of all the extensions over H98 that were in both main compilers by 2010, why was Pattern Guards one of the few that got adopted into the new standard? Especially if they turn out to be not much used.

(I very seldom see them in code examples on discussion forums.)

I at least use them reasonably often. They can be very helpful in scenarios with a lot of pattern-matching.

2 Likes

In my opinion the solution

clunky' env vars@(var1:var2:_)
 = help (lookup env var1) (lookup env var2) vars
where
 help (Just v1) (Just v2) vars   = v1 + v2
 help _         _         [var1] = ... some stuff
 help _         _         []     = ... some stuff

that they reject as too clunky is the solution that I prefer. I personally dislike guards, pattern guards and similar extensions which look like syntactic sugar for an inherently imperative control-flow construct to me: if-else chains. These features destroy the properties that I like about pattern matching: You cannot really do equational reasoning with clauses which use guards, and exhaustiveness-checking becomes a best-effort since the exhaustiveness of boolean guards is in general undecidable. I prefer other extensions which make patterns more expressive and don’t destroy these properties, such as or-patterns, view-patterns and pattern synonyms.

3 Likes

They explain why they reject it:

Worse, we cannot use one lookup in the next. For example, suppose our function was like this:

clunky'' env var1 var2 | ok1 && ok2 = val2
                       | otherwise = var1 + var2
  where
    m1 = lookup env var1
    m2 = lookup env (var2 + val1)
    ok1 = isJust m1
    ok2 = isJust m2
    Just val1 = m1
    Just val2 = m2

Notice that the second lookup uses val1, the result of the first lookup. To express this with a help function requires a second helper function nested inside the first. Dire stuff.

Would you still prefer having many helper functions, one for each match?

Although, I think I’d just use the Maybe monad:

clunkyM env var1 var2 = fromMaybe (var1 + var2) $ do 
  val1 <- lookup env var1
  lookup env (var2 + val1)

But then what do you with the multiple clause variant? E.g. a function like this:

clunkyM env bool var1 var2
  | Just val1 <- lookup env var1
  , Just val2 <- lookup env (var2 + val1)
  = val2
clunkyM env False var1 var2 = var1
clunkyM env True var1 var2 = var2
3 Likes

I see their problem, I just don’t agree with them that pattern matching is the tool that we should use to solve it. I think the issue is that there are two different perspectives on what pattern matching should be: Either a general purpose control-flow construct which is allowed to perform arbitrary computations and tests in order to find the correct branch to dispatch to, or else a more restricted form which only allows to dispatch on the shape of of the scrutinee, but which is more amenable to static analysis by the compiler.

I think it is better to use pattern matching in the restricted sense and to use other control flow mechanisms if we need to dispatch using complex computations. For example, we can use MultiWayIf which also allows to bind variables in guards (which wasn’t yet available when the paper was written, it was only introduced in GHC 7.6.1). We could write something like:

clunky env var1 var2 = if | Just val1 <- lookup env var1
                          , Just val2 <- lookup env var2  = val1 + val2
                          | ... other clauses

If we use a multi-way-if instead of pattern matching we signal more explicitly that we don’t expect the compiler to assist us with exhaustiveness checking.

1 Like

Yes-ish. We had some discussion around that with the ‘Boolean blindness’/algebraic blindness topics. In a lazy language, you don’t have to think of if-else as control-flow, but I’d prefer more symmetrical:

case blah of
  Just x  -> ...
  Nothing -> ...

I agree the exhaustiveness-checking/incomplete patterns is crucial for large scale.

Yeah. Is that example realistic? My o.p. pointed out how contrived some of this code seems. If lookup var2 relies on lookup var1 then we don’t have a symmetrical requirement, and some nesting matches the logic. I’d go (trying to stick close to the original)

clunky'' env var1 var2
 = case lookup env var1 of
     Nothing     -> var1+var2                            -- [**]
     (Just val1) -> help' env (Just val1) (lookup env (var2+val1))
   where
      help' env (Just val1) (Just val2) = val1 + val2
      help' ...                         = ... as before

[**] Being able to name the cases there is much more helpful documentation than an if-else.

Exactly. That question bites with their clunky'' form. If lookup env var1 succeeds, you’d want to take advantage of that (use val1) even if lookup var2 fails.

(But we could get deep into the weeds with one example. There’s a realistic use case of the left parse driving a later parse Erlang-style parsing. Myself I find that code unreadable, and sufficient evidence on its own to reject ViewPatterns. “Dire stuff” indeed.)

Quite realistic in business code. It’s rare to write such abomination in a generic open-source library (so base gravitates to a pattern with go helper function), but I’ve seen tons of business logic where pattern guards are very helpful. It’s a choice between 5-6 levels of case or PatternGuards, where the latter wins simply because of less indentation needed.

1 Like

Thank yous. Would it be possible to see some gnarly examples where guards are the more helpful/least-bad idiom? I appreciate that code will be chunkier than the toy examples we’ve looked at so far.

(“Business code”/at work is not where I’m allowed to write Haskell. My code’s gnarlies are from walking the database.)

2 posts were split to a new topic: Why I like guards and multi-way ifs

Here’s some business code making use of pattern guards, among other things: preInlineUnconditionally. It falls into the “MultiWayIf” use case and would not be fun to write with if/else cascades.

It also demonstrates perfectly that pattern guards help to define more programs in the total-case-of style: By isolating the complicated decision logic into a function returning a Maybe SimplEnv, users can simply pattern match on the result of this function instead of replicating all the control-flow of its implementation.

A language construct that IMO is kind of missing is one where I can mix control-flow and failure in MonadZero, haskell - Pattern matching in `Alternative` - Stack Overflow. Of course, threading the success and failure continuation manually works, but it kind of defeats the purpose of first-match semantics.

Thank you for a gnarly one. Again, we all agree if-else cascades are unreadable; so you’re tilting at the wrong windmill.

In ~30 lines of actual code, there’s only one example (, Just inl <- ...) of the construct I’m asking about. Not clear why that’s not declared in the where like everything else. Looking up and down that module, there’s a few other examples, but none really unavoidable.

I would say that preInlineUnconditionally suffers Boolean blindness, except all the Booleans have long_and_meaningful_names. Given how many of the branches return Nothing: if I didn’t have those ever-so useful Note [ ]s, it would take unreasonable efforts to figure out which params were driving the conditional logic (env, top_lvl, bndr) vs which params contribute to the Just's payload (rhs, rhs_env). I think I’d try a helper go over those three pre-processed to reveal the actual interesting conditions.

Sounds like an excellent use-case for a PatternSynonym rather than more Maybe-blindness: “becomes hard to remember what each Just pattern corresponds to” [SPJ&Martin E].

Here’s another example of pattern guards in business-adjacent logic. It may or may not be convincing. I reach for them quite often, as they tend to require less refactoring than other solutions. Smaller PRs are always a win when working in a team.

It’s been more than 20 years since I’ve read the paper, but I thought there was another compelling reason for pattern guards, quite apart from battling clunkiness. With this extension, there’s a neat symmetry between pattern guards, list comprehensions, and do expressions: they all support exactly let, <-, and expressions, so their abstract syntax is equivalent except a do must end with an expression. There’s also a glaring difference at the type level, sadly: expressions must be of type Bool in guards and list comprehensions, and of type m a in do.

2 Likes

Great thank you! I’m convinced.

Function fill examines complex conditions, calculated from its arguments. (Some) conditions must be examined in a particular sequence: they’re overlapping so testing for a later one makes sense only when the earlier one has failed. Then there’s quite a strong imperative ‘flavour’.

And none of those calculations need re-calculating (part of) a condition already examined.

There are several Just v <- ...s but low risk of ’Maybe blindness’ because each branch has only one v in scope.

incomplete-patterns checking doesn’t stand a chance.

Hmm I can’t guess what you’re trying to remember. I linked the paper in the o.p. [**] Yes the parallel to list comprehensions is explicitly given for the choice of syntax. (I think Monad comprehensions also were part of GHC at the time?; they later got taken out.)

The paper includes a separate proposal ‘transformational patterns’ that didn’t go ahead at the time. Essentially the same idea arrived later as ViewPatterns.

[**] The only version I could find seems to be a rather broken .pdf. Acrobat won’t search within it.