Optimization for Multiple Applications of a Function that is Identity After One Application

Is ghc optimization smart enough to remove multiple applications of a function that turns into the identity function after one application. For example:

abs . abs
-- or
f :: Int -> Bool
f x = if y > 5 then False else g y 
  where y = abs x
g :: Int -> Bool
g x = y > 0
  where y = abs x
1 Like

Are you talking about idempotency? Idempotent functions?

I don’t think GHC has enough information to recognize idempotency in something like abs . abs. That would mean functions would need to get an “idempotent” tag, or something? :man_shrugging:
and f can’t be applied twice, because it returns a Bool, so f . f doesn’t type check; did you mean a different kind of function? (or a different kind of “multiple applications”?)

Main.hs:

f = abs . abs

g = abs

main :: IO ()
main = do
    print $ f (-7)
    print $ g (-7)

Running with (9.10.1): ghc --make Main.hs -ddump-simpl > core.txt

      ($ @GHC.Types.LiftedRep
         @GHC.Types.LiftedRep
         @Integer
         @(IO ())
         (print @Integer GHC.Internal.Show.$fShowInteger)
         (. @Integer
            @Integer
            @Integer
            (abs @Integer GHC.Internal.Num.$fNumInteger)
            (abs @Integer GHC.Internal.Num.$fNumInteger)
            (negate
               @Integer GHC.Internal.Num.$fNumInteger (GHC.Num.Integer.IS 7#))))
      ($ @GHC.Types.LiftedRep
         @GHC.Types.LiftedRep
         @Integer
         @(IO ())
         (print @Integer GHC.Internal.Show.$fShowInteger)
         (abs
            @Integer
            GHC.Internal.Num.$fNumInteger
            (negate
               @Integer GHC.Internal.Num.$fNumInteger (GHC.Num.Integer.IS 7#))))

No optimization in sight.
Now with -O2:

-- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
Main.main3 :: Integer
[GblId,
 Unf=Unf{Src=<vanilla>, TopLvl=True,
         Value=True, ConLike=True, WorkFree=True, Expandable=True,
         Guidance=IF_ARGS [] 10 10}]
Main.main3 = GHC.Num.Integer.IS 7#

The functions got obliterated. But maybe that’s unfair since it’s a constant. Trying something else:

f = abs . abs

g = abs

main :: IO ()
main = do
    x <- getLine
    let y = read x :: Integer
    print $ f y
    print $ g y

Without -O2:

           ($ @GHC.Types.LiftedRep
              @GHC.Types.LiftedRep
              @Integer
              @(IO ())
              (print @Integer GHC.Internal.Show.$fShowInteger)
              (f_rh6 y_aCj))
           ($ @GHC.Types.LiftedRep
              @GHC.Types.LiftedRep
              @Integer
              @(IO ())
              (print @Integer GHC.Internal.Show.$fShowInteger)
              (g_rh7 y_aCj)))

The functions didn’t get inlined and they are indeed two composed abs and one abs, respectively.
Now with -O2:

             (GHC.Internal.Show.$w$cshowsPrec1
                0#
                (GHC.Num.Integer.integerAbs (GHC.Num.Integer.integerAbs y_s1KP))
                (GHC.Types.[] @Char))

        (GHC.Internal.Show.$w$cshowsPrec1
           0# (GHC.Num.Integer.integerAbs y_s1KP) (GHC.Types.[] @Char))

Still two abs and one abs.
Now trying with -fenable-rewrite-rules:

{-# RULES "abs/dot" (.) abs abs = abs #-}

f x = (abs . abs) x

g = abs

main :: IO ()
main = do
    x <- getLine
    let y = read x :: Integer
    print $ f y
    print $ g y

With no -O2:

           ($ @GHC.Types.LiftedRep
              @GHC.Types.LiftedRep
              @Integer
              @(IO ())
              (print @Integer GHC.Internal.Show.$fShowInteger)
              (abs @Integer GHC.Internal.Num.$fNumInteger y_aCl))
           ($ @GHC.Types.LiftedRep
              @GHC.Types.LiftedRep
              @Integer
              @(IO ())
              (print @Integer GHC.Internal.Show.$fShowInteger)
              (g_rh9 y_aCl)))

With -O2:

             (GHC.Internal.Show.$w$cshowsPrec1
                0#
                (GHC.Num.Integer.integerAbs (GHC.Num.Integer.integerAbs y_s1KP))
                (GHC.Types.[] @Char))

        (GHC.Internal.Show.$w$cshowsPrec1
           0# (GHC.Num.Integer.integerAbs y_s1KP) (GHC.Types.[] @Char))

Right, the rule didn’t fire. Tuning it a bit:

{-# INLINE [2] f #-}
f x = (abs . abs) x

Now the core shows:

             GHC.Internal.IO.StdHandles.stdout
             (GHC.Internal.Show.$w$cshowsPrec1
                0# (GHC.Num.Integer.integerAbs y_s1LD) (GHC.Types.[] @Char))
             GHC.Types.True
             ipv_a14r
      of
      { (# ipv2_a1zl, ipv3_a1zm #) ->
      GHC.Internal.IO.Handle.Text.hPutStr2
        GHC.Internal.IO.StdHandles.stdout
        (GHC.Internal.Show.$w$cshowsPrec1
           0# (GHC.Num.Integer.integerAbs y_s1LD) (GHC.Types.[] @Char))
        GHC.Types.True
        ipv2_a1zl

Verdict: unless you have rules for them, GHC can’t tell.

5 Likes

The only scenario I can imagine the compiler can (and must!) detect idempotency is in case of an ADT when the semantics of the function can be analyzed via pattern matches. Otherwise, Data.Function.fix would be impossible. Continuing the abs example:

data SignedInt = SignedInt {
    sign :: !Ordering,
    absValue :: !Int}

fromInt :: Int -> SignedInt
fromInt i = SignedInt (compare i 0) (abs i)
{-# INLINE fromInt #-}

taggedAbs :: SignedInt -> SignedInt
taggedAbs (SignedInt LT i) = SignedInt GT i
taggedAbs s = s
{-# INLINE taggedAbs #-}

main = interact (show.absValue.taggedAbs.taggedAbs.fromInt.read)
1 Like

Really great answer! Thanks. I need to start learning core.

You can make it a lot simpler to read with some tricks:

  1. Make your functions as simple as possible. Avoid I/O if possible.
  2. Suppress unnecessary info like module qualifiers and type applications using -dsuppress-all. (Unless you actually care about that info)
  3. Do not suppress type signatures (-dno-suppress-type-signatures). Type signatures often tell me a lot about which functions I’m looking at if the names have been mangled, so I re-enable this option.
  4. Disable “typeable binds” using -dno-typeable-binds. By default, GHC generates a bunch of special bindings to support the Typeable mechanism, but these are often not necessary. This can break programs that use Typeable, though.
  5. Use -fforce-recomp if you’re only changing flags and not the code itself. Otherwise it will output nothing.

My full command is then:

ghc -ddump-simpl -dsuppress-all -dno-typeable-binds -dno-suppress-type-signatures -fforce-recomp

(With optionally -O or -O2 for optimizations)

Using those tips we might start with a program like this:

module T where

f = abs . abs
g = abs

Which yields this output (if compiled with optimizations enabled):

f :: Integer -> Integer
f = \ (x_aAh :: Integer) -> integerAbs (integerAbs x_aAh)

g :: Integer -> Integer
g = integerAbs

Now you might try adding a rewrite rule:

module T where

{-# RULES "abs idem" forall x. abs (abs x) = abs x #-}

f = abs . abs
g = abs

Now if we compile with optimizations (-O and -O2 both imply -fenable-rewrite-rules and both give the same result in this case) we get… no change:

f :: Integer -> Integer
f = \ (x_aAv :: Integer) -> integerAbs (integerAbs x_aAv)

g :: Integer -> Integer
g = integerAbs

Wait, but didn’t @darkxero’s approach work? Well, yes, but that was really a coincidence. You might have noticed that I’m using a slightly different definition of f. Let’s see what happens when I use the exact same:

module T where

{-# RULES "abs idem" forall x. abs (abs x) = abs x #-}

f x = (abs . abs) x
g = abs

Compiling without optimizations gives us:

f :: forall {c}. Num c => c -> c
f = \ (@c_azf) ($dNum_azp :: Num c_azf) (x_agV :: c_azf) ->
      . (abs $dNum_azp) (abs $dNum_azp) x_agV

g :: Integer -> Integer
g = abs $fNumInteger

Now we can see that the type of f has changed drastically. It is now overloaded over the Num type class instead of specialized to Integer. Why did this happen? Because of the monomorphism restriction. In short, if we don’t write a type signatures then the compiler will only generalize, i.e. add type class constraints, to function definitions with at least one explicit argument. So it add the constraints to f, which has one explicit argument x, but not to g which has no explicit arguments.

If we now compile with optimizations we can see that the rewrite rule does work in this case:

f :: forall {c}. Num c => c -> c
f = \ (@c_azf) ($dNum_azp :: Num c_azf) (x_agV :: c_azf) ->
      abs $dNum_azp x_agV

g :: Integer -> Integer
g = integerAbs

Now why is that? We can find a hint in the dump of the rewrite rule which is conveniently included (but we could otherwise use -ddump-rules to view them):

"abs idem"
    forall (@a_azD)
           ($dNum_azE :: Num a_azD)
           ($dNum1_azG :: Num a_azD)
           (x_ajy :: a_azD).
      abs $dNum_azE (abs $dNum1_azG x_ajy)
      = abs $dNum_azE x_ajy

This shows that the rewrite rule only works on the general abs method from the Num class, but not on the specialized integerAbs function. And specialization often happens before other rules can fire. GHC does tell us that a bit cryptically in a warning message:

T.hs:3:11: warning: [GHC-87502] [-Winline-rule-shadowing]
    Rule "abs idem" may never fire
      because rule "Class op abs" for ‘abs’ might fire first
    Suggested fix: Add phase [n] or [~n] to the competing rule
  |
3 | {-# RULES "abs idem" forall x. abs (abs x) = abs x #-}
  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The “Class op abs” is a special rewrite rule that GHC uses to specialize abs (to absInteger in this case). The suggestion to add phases is unhelpful in this case because the “Class op abs” rule is automatically generated by GHC and cannot be modified by you.

In conclusion, use some options to suppress unnecessary information, and rewrite rules on type class methods are not really useful.

6 Likes

f can call g and both apply abs

I take that back and claim the opposite. Data.Function.fix does not work with any function that employs pattern matching to determine the constructor of the output. Still, patterns could facilitate a sufficient amount of static analysis to detect idempotency.

You are right.
proceeds to cheat

module T where

{-# RULES "dabs/dot" dabs . dabs = dabs #-}

{-# INLINE [2] dabs #-}
dabs = abs

f = dabs . dabs

g = dabs

h x = (dabs . dabs) x
==================== Tidy Core ====================
Result size of Tidy Core
  = {terms: 8, types: 8, coercions: 0, joins: 0/0}

Rec {
dabs :: Integer -> Integer
dabs = integerAbs
end Rec }

f :: Integer -> Integer
f = integerAbs

g :: Integer -> Integer
g = integerAbs

h :: Integer -> Integer
h = integerAbs


------ Local rules for imported ids --------
"dabs/dot" forall. . dabs dabs = dabs

Thank you for the command, it’s very handy.

2 Likes