Or I guess the reason why that can’t be done is simply the fact that S has a ⊥ as a possible state and GHC won’t ever bother proving the function never returns it.
Which would mean the answer is either an unboxed one-tuple or declaring S explicitly unlifted.
They’re not the same once you apply them uninlined, since you get roughly
case Main.otherB Main.L of x [Occ=Once1!] {
__DEFAULT -> ...
}
and
case Main.otherU Main.L of
Solo# x [Occ=Once1] -> ...
The desired behavior is doing as few evaluations as needed.
For an example think of a dictionary that maps values to unlifted constructors of S and some lookupR :: Tree -> S that is strictly used as an intermediate step in a function. Invoking let !s = lookupR t not only runs lookupR, it also evaluates the s, which shouldn’t be necessary in this particular case.
The unboxed tuples approach should be as performant as the default one if GHC is smart enough, so the question is really just “do I need to bother with the tuples or am I wasting my time?”.
Can you give a full example then? I still cannot reproduce it with this:
{-# LANGUAGE UnboxedTuples #-}
module SLR where
data S = L | R
{-# NOINLINE otherB #-}
otherB :: S -> S
otherB L = R
otherB R = L
{-# NOINLINE otherU #-}
otherU :: S -> (# S #)
otherU L = (# R #)
otherU R = (# L #)
mainB :: S
mainB = case otherB L of
L -> R
R -> L
mainU :: S
mainU = case otherU L of
(# L #) -> R
(# R #) -> L
That seems to give “good” STG:
otherB =
\r [ds_slE]
case ds_slE of {
L -> R [];
R -> L [];
};
L = L! [];
mainB =
\u []
case otherB L of {
L -> R [];
R -> L [];
};
R = R! [];
otherU =
\r [ds_slH]
case ds_slH of {
L -> Solo# [R];
R -> Solo# [L];
};
mainU =
\u []
case otherU L of {
Solo# ipv_slK ->
case ipv_slK<TagProper> of {
L -> R [];
R -> L [];
};
};
(The <TagProper> means that the second case is not actually doing an evaluation.)
I tend to pack stuff into IORefs to kill GHC optimizations on that front.
main :: IO ()
main = do
a <- newIORef L
x <- readIORef a
let !x' = otherB x
writeIORef a x'
b <- newIORef L
y <- readIORef b
let !(# y' #) = otherU y
writeIORef b y'
That is precisely the point of the question, the y' at this point in the application should never be a box because these functions only return an L or R, i.e. a pointer to a shared constructor value.
Whether a variable is demanded is a static property and whether a value is a thunk (all pointers are boxes, some are thunks) is a dynamic property.
Statically, GHC will not try to evaluate the result of otherU unless you explicitly demand it, but dynamically it will not be a thunk because the otherU function does not produce thunks. (And GHC does not introduce a new thunk to store the result of otherU because you do demand the unboxed tuple.)
Lifted values only cause redundant work when (1) thunks are created to store them in, or (2) they are demanded and thus evaluated (which means that it checks whether the value is a thunk and if so runs it). In your program the otherU function does not create thunks, so (1) is not an issue. It seems you are mostly worried about (2), but that cannot cause an issue if you never demand the value.
If you do demand the y' then (2) might be an issue, but I’m showing that it produces a case match with a <TagProper> annotation, which is GHC’s way of telling you that it knows that the value is not a thunk and thus no evaluation needs to happen.