This is a non-terminating expression: repeat id <*> []
.
The reason for this is that the Applicative []
instance is defined using list comprehensions:
instance Applicative [] where {-# INLINE pure #-} pure x = [x] {-# INLINE (<*>) #-} fs <*> xs = [f x | f <- fs, x <- xs] {-# INLINE liftA2 #-} liftA2 f xs ys = [f x y | x <- xs, y <- ys] {-# INLINE (*>) #-} xs *> ys = [y | _ <- xs, y <- ys]
The source code points to this Note by way of explanation:
Note: [List comprehensions and inlining]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The list monad operations are traditionally described in terms of concatMap:xs >>= f = concatMap f xs
Similarly, mconcat for lists is just concat. Here in Base, however, we donât
have concatMap, and weâll refrain from adding it here so it wonât have to be
hidden in imports. Instead, we use GHCâs list comprehension desugaring
mechanism to define mconcat and the Applicative and Monad instances for lists.
We mark them INLINE because the inliner is not generally too keen to inline
build forms such as the ones these desugar to without our insistence. Defining
these using list comprehensions instead of foldr has an additional potential
benefit, as described in compiler/GHC/HsToCore/ListComp.hs: if optimizations
needed to make foldr/build forms efficient are turned off, weâll get reasonably
efficient translations anyway.
This is a reasonable explanation for Monad []
, but for Applicative []
it begs the question of why (something equivalent to) a concatMap
-based implementation is desirable in the first place. A guarding clause on (<*>)
and friends suffices to solve the problem:
instance Applicative [] where
[] <*> _ = [] -- EDIT: added to prevent `[] <*> undefined` from becoming an error
_ <*> [] = []
fs <*> xs = [f x | f <- fs, x <- xs]
-- etc.
Iâm sure this was proposed by someone at some point in the history of Haskell. But has anyone written up why itâs inferior to what we have? Is it list fusion? If so, itâs because (<*>)
would no longer be a âgood consumerâ of its second argument, but it would still be a good consumer of its first and a âgood producerâ overall, right? (I have been looking for an adequate definition of these hinted-at terms and have come up empty.) In what circumstances does this difference matter?
Could the desired list fusion properties be restored with this (edited to start from the actual desugaring of the list comprehension; edited again to handle [] <*> undefined
)?
fs <*> xs =
build (\c n ->
foldr (\f b1 ->
foldr (\_ _ ->
foldr (\x b2 -> c (f x) b2) b1 xs
) n xs
) n fs
)
Or is that not how it works?
If a one-clause list comprehension implementation is better for other reasons, could GHC do an ApplicativeDo
-like analysis on list comprehensions (that is, detect cases where later source lists donât depend on earlier ones) and produce code for [f x | f <- fs, x <- xs]
that surely terminates if either fs
or xs
is empty? What are the trade-offs here?