HELP! Hunting for Problematic Strictness in Products

I have recently discovered problematic strictness in the base package, and I need help finding and fixing it everywhere, since it seems pervasive in a number of packages besides base.

Currently, I have created issues.

I know of this problem in two types:

  • base Data.Functor.Product.Product
  • base GHC.Generics.(:*:)

Also discussed in Defining Applicative for the Product of Two Monads

3 Likes

I’m not yet convinced that the Alternative/Alt (and maybe MonadPlus, though its laws are a mess) instances ought to be lazy. One of the laws for Alternative is empty <|> a = a, and if a is ⊥ then making <|> for Product lazy would mean that empty <|> a is Pair _ _ and not ⊥.

I suppose the same logic should apply for Semigroup, given the Monoid laws, but I feel a little iffier about that for some reason.
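To make that concrete (a sketch of my own, not anything base ships): a maximally lazy pair append bends the Monoid identity law at ⊥ in just the same way:

lazyAppend :: (Semigroup a, Semigroup b) => (a, b) -> (a, b) -> (a, b)
lazyAppend ~(x1, y1) ~(x2, y2) = (x1 <> x2, y1 <> y2)

-- read literally at a = ⊥, the law  mempty <> a = a  fails:
-- the left-hand side is a pair in WHNF, not ⊥
demo :: ()
demo = lazyAppend (mempty :: ([Int], [Int])) undefined `seq` ()   -- ()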

1 Like

This is above my current knowledge: if empty <|> ⊥ is evaluated, where does Pair _ _ come from?

My uninformed thinking is that empty is evaluated, and then <|> tries to unpack the other side as a Pair; but since we wrote empty <|> ⊥ and not empty <|> Pair ⊥ ⊥, the match finds ⊥ and ⊥ is what we get back.

instance (Alternative f, Alternative g) => Alternative (Product f g) where
    empty = Pair empty empty
    -- matching Pair on both sides forces each argument to WHNF,
    -- so ⊥ on either side makes the whole result ⊥
    Pair x1 y1 <|> Pair x2 y2 = Pair (x1 <|> x2) (y1 <|> y2)

empty <|> undefined == explosion  -- the strict Pair match forces undefined

-- but with an irrefutable pattern, i.e. Pair x1 y1 <|> ~(Pair x2 y2) = ...:
empty <|> undefined == Pair x1 y1 <|> undefined == Pair (x1 <|> explosion) (y1 <|> explosion)

Is it even possible to try pattern matching on the left side and the right side to check for empty, without exploding, and then return the bomb, or explode it if no matches are found?

I have realized this was a very stupid question, because you blow up as soon as you force the returned bottom anyway.
BUT BEHOLD
LAZINESS???

{-# LANGUAGE FlexibleContexts, ScopedTypeVariables #-}
import Control.Applicative
import Control.Exception
import System.IO.Unsafe

data Product f g a = Pair (f a) (g a) | PairIdentity (f a) (g a) deriving Show

-- (Functor/Applicative instances omitted, as in the original post)
instance (Alternative f, Alternative g) => Alternative (Product f g) where
    empty = PairIdentity empty empty
    p1 <|> p2 = unsafePerformIO $ do
        -- probe each side: 0 means it matched PairIdentity (our empty),
        -- 1 means the match blew up (a Pair, or a bomb)
        x <- c (f p1)
        y <- c (f p2)
        if x == 0 then return p2          -- empty <|> p2 = p2
        else if y == 0 then return p1     -- p1 <|> empty = p1, even if p1 is a bomb
        else return $ Pair (x1 <|> x2) (y1 <|> y2)
        where
            f = (\(PairIdentity _ _) -> return 0)
            c x = catch x (\(_ :: SomeException) -> return 1)
            -- lazy pattern bindings: only forced in the last branch
            (Pair x1 y1) = p1
            (Pair x2 y2) = p2
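A quick ghci sanity check of the hack (my example; the types are pinned down by the Just 'a'):

ghci> PairIdentity empty empty <|> Pair (Just 'a') Nothing
Pair (Just 'a') Nothing
ghci> case Pair (Just 'a') Nothing <|> undefined of Pair _ _ -> "no explosion"
"no explosion"

The second call would be ⊥ outright with the strict instance.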

The Semigroup (a, b) instance is also strict in this way, which means that while one can usefully write:

ones :: [Int]
ones = [1] <> ones -- not the usual way to do it, but it works

the following will loop out:

oneses :: ([Int], [Int])
oneses = ([1], [1]) <> oneses

Seems like the same vibe as the pattern you’re hunting, but on even less exotic ground.
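(For what it’s worth, the knot can still be tied by hand with a sketch like this, sidestepping the strict (<>) match:

oneses' :: ([Int], [Int])
oneses' = let (a, b) = oneses' in ([1] <> a, [1] <> b)
-- the let pattern binding is lazy, so the pair is in WHNF immediately;
-- take 3 (fst oneses') == [1,1,1]

but that rather concedes that the instance isn’t doing the work.)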

I’m really not sure what would be the best thing to do about this, even imagining that this was all being designed from scratch. Thinking too long about whether applying fast and loose reasoning to class laws is actually morally correct, and whether tuples or Products are sufficiently like a categorical product despite Hask famously not being a real category, leaves me feeling like I know less than I did to begin with.

2 Likes

This is cute but doesn’t address bottoms that represent forms of non-termination that the compiler doesn’t translate to an error, even aside from whatever the consequences of playing with unsafePerformIO may be.

I guess you could toss a timeout in there if you really don’t care about getting on the bad side of the Machine God.

LOGICIANS HATE THIS SIMPLE TRICK

import System.IO.Unsafe
import System.Timeout
import Control.Exception

-- "decides" definedness: exceptions count as bottom, and so does anything
-- still running after maxBound microseconds (roughly 292,000 years)
isUndefined :: a -> Bool
isUndefined x = case unsafePerformIO (catch cf ef) of
    Just True  -> True
    Just False -> False
    Nothing    -> True   -- the timeout fired
    where
        cf = timeout (maxBound :: Int) $ seq x $ return False
        ef = (\_ -> return (Just True)) :: SomeException -> IO (Maybe Bool)

f :: Int -> Int
f x
    | isUndefined x = -999
    | x == 0        = 0
    | x == 1        = 100
    | otherwise     = x

main = do
    print $ f 0
    print $ f 1
    print $ f 2
    print $ f undefined
1 Like

I don’t understand the role of timeout here.

There are non-terminating bottoms. For example:

f :: [Int]
f = f

main = print f

GHC can catch it and print

a.out: <<loop>>

Therefore:

main = print $ isUndefined f

Prints True, and then you don’t need the timeout: the exception gets triggered.

Now consider:

f :: Integer
f = last [1..]   -- Integer, so [1..] is genuinely infinite (with Int it would stop at maxBound)

GHC doesn’t catch this one. If you don’t have the timeout and you just have:

cf = seq x $ return (Just False)

then the program runs forever.


“You solve the halting problem by resetting the server everyday at 3 am.” - Kurt Gödel

3 Likes

The Halting Problem Does Not Matter (1984)

1 Like

Indeed, my understanding is that class laws are subject to fast and loose reasoning, that is, ⊥ should not need to be considered at all.

If a lazier instance is needed, why not define a newtype MaximallyLazy a = ML a for which the Alternative, Semigroup, Applicative, etc. instances are as lazy as possible?

Otherwise be aware that rewriting all instances to be excessively lazy means slower code and more space leaks, because the optimiser has fewer assurances to chew on.
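A minimal sketch of that idea, specialised to pairs (the name and instance here are mine, purely illustrative):

newtype ML a b = ML (a, b)

instance (Semigroup a, Semigroup b) => Semigroup (ML a b) where
    -- the newtype match is free and the inner pattern irrefutable,
    -- so neither argument is forced until a component is demanded
    ML ~(x1, y1) <> ML ~(x2, y2) = ML (x1 <> x2, y1 <> y2)

-- the looping tuple example from above now ties the knot:
onesesML :: ML [Int] [Int]
onesesML = ML ([1], [1]) <> onesesML
-- take 3 a == [1,1,1]  where  ML (a, _) = onesesML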

1 Like

…or (and since Haskell is already non-strict by default) why not define:

newtype HyperStrict a = HS a

hyperstrict :: a -> HyperStrict a
hyperstrict x = case compel x of r -> HS r

where compel :: a -> a is a new primitive hyperstrict-evaluator function, as mentioned here and elsewhere.
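In the meantime, the nearest off-the-shelf approximation is force from the deepseq package, trading the primitive for an NFData constraint (a sketch):

{-# LANGUAGE BangPatterns #-}
import Control.DeepSeq (NFData, force)

newtype HyperStrict a = HS a

hyperstrict :: NFData a => a -> HyperStrict a
hyperstrict x = let !r = force x in HS r   -- x is fully evaluated before HS r is returned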

Conal’s packages unamb and lub may be of relevance here.

If isUndefined really worked, then you’d have a non-monotone function which is impossible per domain semantics. (I’m not claiming that all of Haskell adheres to domain semantics, but a good deal, especially questions involving bottoms, does.)

That proof is not constructive, in the sense that the mere existence of Q is derived from the assumptions about P, without providing any algorithm. Phrasing it in Haskell: given p :: a -> b, it is postulated that there exists q :: a -> Bool so that q a = True if and only if p a is a total value, i.e. not (isUndefined (p a)). With the help of this q we can construct

p' :: a -> Maybe b
p' a = if q a then Just (p a) else Nothing

But in practice, this is not good enough, since we’re lacking an algorithm

haltingSet :: (a -> b) -> (a -> Bool)

that can compute the q for us, given p. The programmer must write q alongside writing p. The non-existence of haltingSet is the Halting Problem. Also notice that

isUndefined = not . flip ($) () . haltingSet . const
haltingSet f = not . isUndefined . f

are interdefinable.
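Unfolding the first equation as a sanity check (my derivation):

isUndefined x
    = not (haltingSet (const x) ())      -- flip ($) () supplies the argument ()
    = not (const x () is a total value)  -- the specification of haltingSet
    = not (x is a total value)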

It was mostly intended as a joke.
The problem with my isUndefined is that it treats legitimate long-running calculations as bottoms.

So, if not laws, what ought to guide us in determining how strict class members ought to be in their various arguments? Is there any sort of relevant principle or are we always going to be groping around on a case-by-case basis for the best tradeoff between expressiveness and performance with no guidance from the domain semantics? (I find that prospect alarming but maybe it’s the best we can do?)

The laws could entail that, without a fast and loose interpretation, the class has no members at all. The simplest example is the class Eq.
Proposition: Let ~ be an equivalence relation on a domain with ⊥. If ~ is also a congruence relation, that is, if a ⊒ x ~ y ⊑ b implies a ~ b, then a ~ b for all a, b. Proof: Take x = y = ⊥; reflexivity gives ⊥ ~ ⊥, and a ⊒ ⊥ and ⊥ ⊑ b hold for every a and b.
The documentation for == in Data.Eq states that == gives rise to an equivalence relation (“encouraged to”, actually), and since == is a monotone function, this relation is a congruence. The same applies to all classes with total class members, such as Ord etc.
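To see how the real Eq Int escapes the proposition: reflexivity simply fails at ⊥ (a quick ghci check of mine):

ghci> (undefined :: Int) == undefined
*** Exception: Prelude.undefined

If instead ⊥ == ⊥ were True, monotonicity would give True = (⊥ == ⊥) ⊑ (a == b), and since True is maximal, a == b = True for all a and b.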

Okay, okay. This is probably ground well-trodden decades ago by the Wise Ones of the Mailing Lists, if it isn’t so obvious that the Wise Ones never bothered to verbalize it, but here’s what I’m currently thinking. Epistemic status: I am a filthy pleb and I’ve been changing my mind about all this like twice a day.

Claim 1: A categorical product in a category with lifted types should not itself be lifted.

For my fellow plebs, a categorical product is defined by a unique way to combine given functions f : A → B and g : A → C to form a function 〈f, g〉 : A → B × C that correctly composes with projections. Assume f and g are chosen so that there is some a for which f(a) = g(a) = ⊥. If B × C is a lifted type, there are two valid ways to take the product of f and g:

  • 〈f, g〉(x) = (f(x), g(x)) (the usual way)
  • 〈f, g〉(x) = if x = a then ⊥ else (f(x), g(x))

We know these are distinguishable functions because ⊥ is distinguishable from (⊥, ⊥).
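(They are observably different in Haskell, too; a ghci sketch:)

ghci> (undefined, undefined) `seq` "still here"
"still here"
ghci> (undefined :: (Int, Int)) `seq` "still here"
*** Exception: Prelude.undefined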

As a practical matter, making tuple types and Product literally unlifted types in Haskell seems like a lot of hassle.

Claim 2: The next best thing to using an unlifted product type is using the lifted pseudoproduct type but with a gentleperson’s agreement to consider ⊥ as equivalent to (⊥, ⊥).

This is a wishy-washy claim, but note that the above argument is no longer valid if we decide to ignore the difference between ⊥ and (⊥, ⊥). The projection functions fst and snd are constrained by monotonicity to map ⊥ to ⊥, so there is no other value we could consider as a candidate for equivalence to ⊥ that would rescue the categorical product.

We can compose any function that in its Platonic form would take an unlifted product type with the tuple-expansion function λx. (fst x, snd x), and any unlifted product type returned by a function will trivially embed in the lifted type.

Claim 3: A lazy tuple pattern is equivalent to composing with the tuple-expansion function.

I think this is straightforward.
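A sketch of the equivalence (my example; neither version forces its argument until a component is demanded):

g :: (Int, Int) -> (Int, Int)
g ~(x, y) = (y, x)            -- lazy tuple pattern

g' :: (Int, Int) -> (Int, Int)
g' p = case (fst p, snd p) of -- composed with the expansion λx. (fst x, snd x)
    (x, y) -> (y, x)

-- g undefined `seq` () == ()   and likewise for g'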

Taken all together, this leads me to believe that to the extent that we want to think of tuples and Products as things with the categorical-product nature, we should make every tuple/Product pattern lazy if it makes any observable difference at all. Any law violations that rely on the difference between ⊥ and (⊥, ⊥) should be excused. If this conflicts with engineering concerns, we should trade those off against wanting to be able to reason abstractly about products.

My picture of what the engineering concerns are is still very fuzzy (Doesn’t GHC do a fair amount of strictness analysis, and doesn’t that mean that many lazy patterns end up being strictified? But when does this fail, and what other types of problems are lurking?), but at least I think I have a better idea of what the ideal would be sans those concerns.

The odds that someone will change my mind on this in half a day are still pretty high though, so I’m posting this here and not on the CLC thread so I don’t end up wasting the CLC’s time.

Can’t you just (informally) say "this function evaluates its first arg as deep as fst", separate from any equational laws, many of which were probably written under the implicit assumption of fast and loose reasoning? You could even come up with a formal language to describe the strictness properties, à la projections ("if you evaluate fst (f x), that will evaluate x at least as much as snd would").
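GHC’s demand analysis is, internally, already such a formal language, descended from the projection-based strictness analysis literature. For instance (my sketch; the exact notation varies by GHC version), compiling with -ddump-str-signatures prints a demand signature per binding:

addPair :: (Int, Int) -> Int
addPair (x, y) = x + y
-- reported demand roughly <1!P(1!L,1!L)>: the argument is evaluated,
-- unpacked to a pair, and both fields are demanded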