Short cut fusion + NonEmpty

GHC has two fusion functions, build and augment (link), but these are defined for regular lists.

For non-empty lists I inevitably end up with

case augment _ [x] of
  []     -> error "impossible" -- Reordered by GHC to be on top
  y : ys -> y :| ys

Is there any way to bully GHC into accepting ~(y:ys) = augment _ [x] in this one spot or do I have to ship an inaccessible error message?

I haven’t tried augment, but there is a version of build that can be written for non-empty lists:

-- | Similar to '(:|)', but instead with a 'Maybe' 'NonEmpty'.
pattern (:&?) :: x -> Maybe (NonEmpty x) -> NonEmpty x
pattern x :&? y <- (x :| ~(nonEmpty -> y))
  where
    x :&? y = maybe (x :| []) ((x :|) . toList) y

{-# COMPLETE (:&?) #-}

build1 :: forall x. (forall y. (x -> Maybe y -> y) -> Maybe y -> y) -> NonEmpty x
build1 f = f (:&?) Nothing

What happens if you use lazy or irrefutable pattern-matching?

  • let y:ys = augment … [x] in y:|ys
    
  • case augment … [x] of ~(y:ys) -> y:|ys
    
  • (\ ~(y:ys) -> y:|ys) (augment … [x])
    

Both result in non-exhaustive pattern match warnings and in GHC inserting

[] -> throw (PatternMatchFail "src/Your/Module/Path.hs:line:col-col|y : ys")

, which is even less desirable than a custom errorWithoutStackTrace.

GHC now rejects annoys alerts you to the use of lazy and irrefutable patterns if they’re partial? The totalitarians have been busy!

When dealing with such intransigence, just keep it simple:

case augment … [x] of
  []     -> ending
  y : ys -> y :| ys

…where:

ending :: a
ending = let msg = error msg in error msg

@BurningWitness what do you mean by “GHC inserting”? Is your goal to obtain a Core program without an error call?

If you want to avoid the error then I think you’ll have to use unsafeCoerce, e.g.:

data Vec n a where
  VNil :: Vec 0 a
  VCons :: a -> Vec n a -> Vec (1 + n) a

main = let VCons x _ = unsafeCoerce [0..3 :: Int] :: Vec 3 Int in print x

@Bodigrim “GHC inserting” as in no matter what I do GHC produces the pattern match I outlined in the topic. The goal is to produce a program where [] is never pattern matched against. In the unlikely event that the return is an empty list I’m fine with undefined behavior.

Admittedly there is weirdness to this request:

  • If GHC decides to do a faulty optimization with this in the middle, whoever is looking into it may have to investigate a segfault instead of a throw;

  • Depending on how the code gets compiled, this pattern match could be a conditional and the branch predictor could figure out [] never gets accessed. The price of a throw in this case would be quite a bit lower.


@jaror Oh hey, this works.

The caveat is that the definition of Vec has to exactly match that of the List, which feels brittle, since I don’t know whether List's definition is set in stone. Makes me wish there were a way to either add the extra proof over a newtype or to extract the non-emptiness of a [x] singleton list.

I think what @mixphix is suggesting is a better approach than unsafely coercing, namely to write new rewrite rules for NonEmpty. You could do it with Maybe, but you can also just write the fold and build directly on the normal NonEmpty type:

{-# INLINE [0] foldNonEmpty #-}
foldNonEmpty :: (a -> b -> c) -> b -> (a -> b -> b) -> NonEmpty a -> c
foldNonEmpty k1 z k (x :| xs) = k1 x (foldr k z xs)

{-# INLINE [0] buildNonEmpty #-}
buildNonEmpty 
  :: (forall c b. (a -> b -> c) -> b -> (a -> b -> b) -> c)
  -> NonEmpty a
buildNonEmpty f = f (:|) [] (:)

{-# RULES
  "foldNonEmpty/buildNonEmpty"
    forall k1 z k 
           (f :: forall c b. (a -> b -> c) -> b -> (a -> b -> b) -> c). 
    foldNonEmpty k1 z k (buildNonEmpty f) = f k1 z k
  #-}
2 Likes

The issue with NonEmpty is that the last element needs to somehow cross the (:|) in the specific case when it’s the only element, and I don’t know if that can be proven to the compiler without adding extra state. If I use the other non-empty list, it’s rather trivial:

data List1 a = One a | Cons1 a (List1 a)

{-# INLINE [0] foldr_ #-}
foldr_ :: (a -> b -> b) -> (a -> b) -> List1 a -> b
foldr_ f z = go
  where
    go (One a)      = z a
    go (Cons1 a as) = f a (go as)

{-# INLINE [1] build1 #-}
build1 :: Build a -> a -> List1 a
build1 g x = g Cons1 (One x)

{-# RULES
"foldr_/build1"    forall k z (g :: forall b. (a -> b -> b) -> b -> b) x .
                   foldr_ k z (build1 g x) = g k (z x)
  #-}

Turned out I don’t actually need it: I can instead have (a -> b -> b) and (a -> b) as extra arguments in the function producing the list and provided I align function arities (which I didn’t do the first time), GHC will inline it all the same.