Is eta-expansion required for inlining and what are the downsides?

The GHC docs on the INLINE pragma say the following:

Moreover, GHC will only inline the function if it is fully applied, where “fully applied” means applied to as many arguments as appear (syntactically) on the LHS of the function definition. For example:

comp1 :: (b -> c) -> (a -> b) -> a -> c {-# INLINE comp1 #-} comp1 f g = \x -> f (g x) comp2 :: (b -> c) -> (a -> b) -> a -> c {-# INLINE comp2 #-} comp2 f g x = f (g x)

The two functions comp1 and comp2 have the same semantics, but comp1 will be inlined when applied to two arguments, while comp2 requires three. This might make a big difference if you say

map (not `comp1` not) xs

which will optimise better than the corresponding use of comp2.

Does this mean if I’ve got a function in the form of:

{-# INLINE f #-}
f x = {- something small -}

I should be changing it to be like this?

{-# INLINE f #-}
f = \x -> {- something small -}

if there’s a significant chance of it being partially applied, such as in a call to map (so inside the map call the unfolding is exposed an it can actually be inlined)?

If the docs are right about this, my next question is why? If making something {-# INLINE #-} or {-# INLINABLE #-} exposes the implementation in the interface file why can’t the caller just transform it appropriately? It’s got all the code there anyway?

Also, is there any downside to doing this transformation if it is required for inlining?

The one thing I can think of is by transforming:

f x = {- something small -}

to

f = \x -> {- something small -}

Then f is now a CAF and CAFs can be bottom, although if one exposes the implementation of f via {-# INLINABLE #-}, I’d hope GHC can statically work out that f isn’t bottom and remove the check because even if:

f = \x -> undefined

f is still non-bottom (even if f 5 is bottom).

Presumably there is a downside of doing this transformation, as otherwise I would think GHC would do it automatically (instead of insisting people uglify their code by pushing their function arguments into lambdas).

7 Likes

Thanks for raising this. I would also be interested to know the answer. The current behaviour seems ad hoc.

I believe the idea here is to provide a heuristic that usually does the right thing, but gives the programmer control to prevent unnecessary inlining (which can easily cause code bloat). If you have map f xs then there’s not usually much point inlining to produce map (\ x -> body) xs, is there? In most (first-order) cases it’s only useful to inline f if there is an argument to which it will be applied, so the body can be further optimized based on that argument. (The comp1 example in the docs is an exception where it is useful to inline once the first two arguments are known, even if the final argument is not.)

1 Like

Adam is right. Inliniing f in map f xs, to give map (\x.blah) xs simply duplicates the body of f for no gain. So GHC lets you specify how many arguments the functions should be applied to before the function inlines.

No point in bloating code unnecessarily!

3 Likes

This was discussed in #23150: Making INLINE functions behave more like normal functions · Issues · Glasgow Haskell Compiler / GHC · GitLab where you mentioned:

Why do we have this “inline only if saturated” rule? The reasons seem to be lost in the mists of time. Clearly we should be rather cautious about changing this, but

  • we have no documented reason for why it is a good idea
  • it can make INLINE functions behave worse than non-INLINE ones

I’ve removed this constraint in !11776: Don't take arity into account with INLINE pragmas (#23150) · Merge requests · Glasgow Haskell Compiler / GHC · GitLab but there were some regressions iirc. Anyone should feel free to pick this up.

2 Likes

It seems clear indeed that we should have some control over how many arguments must be seen before we inline a function call, but what is not clear is why the control relates to the number of arguments syntactically on the LHS of a definition. Personally I find it very odd that

f x y =  ...

has different behaviour to

f = \x y -> ...

In particular, it makes \case, and especially \cases, rather dangerous. As a straw-man proposal, why not something like:

{-# INLINE_WHEN f x y #-}

versus

{-# INLINE_WHEN f #-}

(As @hsyl20 says, we should be very cautious about changing default behaviour, so this isn’t a real proposal just some food for thought.)

3 Likes

I’m inclined to agree. The current design isn’t the result of Deep Thought; it dates back a long time and was probably just the easiest thing to do at the time.

If someone wants to come up with an alternative design, write a short GHC proposal, and implement it (not hard), I’m all for it. The trickiest bit will be tihnking about the effects of any change on existing programs. Inlining can have a profound impact, and any regressions might be in performance (harder to spot) rather than “doesn’t compile any more”.

6 Likes

That ship has sailed with the advent of simplified subsumption. You can not even be sure that, if one version compiles, the other does, too.

Oh? Can you give an example? I thought that the former was sugar for the latter. What is it sugar for, then?

N.B. I’m not confused by the difference between f and \x -> f x, in case that’s what you were thinking of.

3 Likes

I’m even more confused now.

After reading the docs further (and it’s also referenced in the replies), the docs say:

While GHC is keen to inline the function, it does not do so blindly. For example, if you write

map key_function xs

there really isn’t any point in inlining key_function to get

map (\x -> body) xs

Perhaps I’m not understanding the definition of “inlining” in GHC. The notion of “inlining” I have in my head is

"Hey compiler, you see this call to f here? You thought you’d have to just push a whole lot of arguments onto the stack and do a JMP. Well guess what? Here’s the complete implementation of the code. You can just put all the instructions inline, no need for pushing arguments and jumping. Maybe you can adjust to code of f to just access the registers your data is already in. And if you’re lucky, you might be able to statically compute some stuff, like if you’ve doubled a variable, and f halves it, you’ll then know they cancel out, and won’t need to produce code for either! Couldn’t do that if f wasn’t inlined!

That’s a very C/C++ notion, but I thought the idea was similar in Haskell.

Because if it is, I can’t see why inlining isn’t useful in map key_function xs.

Lets say we have:

module A where

{-# INLINE key_function_1 #-}
key_function_1 = Just

{-# INLINE key_function_2 #-}
key_function_2 = id
module B where

my_map f = map (isJust . f)

l1_result = my_map A.key_function_1 l1

l2_result = map A.key_function_2 l2 

Then surely, in both cases couldn’t inlining be useful? In the first case, with inlining, the compiler could combine isJust . Just to transform:

my_map A.key_function_1 l1

to

map (\_ -> True) l1

And in the second case knowing that the implementation is id, a smart compiler could eliminate the call to map entirely, as map id x == x (or perhaps fire a rewrite rule that it couldn’t otherwise if the definition of key_function_2 wasn’t known).

I know these are trivial examples that GHC would likely spit into the interface file anyway so the {-# INLINE #-} pragma here is probably redundant, but my point stands, that it seems very useful to be able to “inline” even simple one argument functions to map and map like functions, even though the docs (and others in this thread) say no.

Which means either everyone else is wrong or I’m completely misunderstanding something. From experience when this situation comes up it’s usually the latter.

So what am I completely missing here?

The example does not exhibit a difference between f = g and f = \x -> g x but between f = g and f x = g x, which is disturbing enough.
The real-life example arose when trying to type-specialize a Lens. See this thread. This was particularly upsetting because the documentation shows (not eta-expanded) code that would not compile. Minimal example here.

In the first case you’d inline my_map first

l1_result = map (isJust . A.key_function_1) l1

then (.)

l1_result = map (\x -> isJust (A.key_function_1 x)) l1

then A.key_function_1

l1_result = map (\x -> isJust (Just x)) l1

then inlining isJust causes it to fuse with Just, as you expect.

In the second case you will get key_function_2 inlined, because its LHS has zero syntactic arguments.

Right, I’m surprised by the former, not the latter (I accept that some may find the latter nonetheless disconcerting).

EDIT: Sorry, I misread. As far as I’m aware there is no difference between f = \x -> g x and f x = g x (bar this inlining heuristic) so the distinction between f = g and f = \x -> g x is exactly the same as the distinction between f = g and f x = g x. If you disagree could you elaborate further?

1 Like

I agree, and the minimal example seems to confirm that. Apologies for polluting this thread, since it has little do do with inlining.

2 Likes

That’s OK. Just wanted to make sure everyone was on the same page.

Just a follow up, from the docs:

comp1 :: (b -> c) -> (a -> b) -> a -> c
{-# INLINE comp1 #-}
comp1 f g = \x -> f (g x)

comp2 :: (b -> c) -> (a -> b) -> a -> c
{-# INLINE comp2 #-}
comp2 f g x = f (g x)

The two functions comp1 and comp2 have the same semantics, but comp1 will be inlined when applied to two arguments, while comp2 requires three. This might make a big difference if you say

map (not `comp1` not) xs

which will optimise better than the corresponding use of comp2

https://downloads.haskell.org/ghc/latest/docs/users_guide/exts/pragmas.html#inline-and-noinline-pragmas


Now consider:

myMap = map (not `comp1` not) where 
  comp1 f g = \x -> f (g x) 

vs

myMap = map (not `comp2` not) where 
  comp2 f g x = f (g x) 

Does the inline machinery fail to work on let and where binded inner clauses if the right number of arguments are not turned into lambdas?

As an alternative to these syntactic conditions, staged programming uses the type system to control inlining.

For example, a function of type Code a -> Code b -> Code c will result inlining but Code (a -> b -> c) will not.

If you rely on inlining for the performance of your application, it could be a good idea to look into staging as a means of guaranteeing this.

4 Likes

To answer your question: when you do not specify an INLINE pragma, there is no difference between comp1 and comp2, so both myMaps should lead to identical code.

In the above code, will comp1 be available for inlining?

If so, your answer suggests comp2 will be available for inlining?

And since your answer doesn’t mention anything about the functions being defined in the where clause, so would things still be the same if I did:

myMap = map (not `comp1` not)

comp1 f g = \x -> f (g x) 

or

myMap = map (not `comp2` not) 

comp2 f g x = f (g x)

and if this makes no difference, we’ve concluded that both comp1 and comp2 can be inlined here.

But according to the GHC docs, if we do this:

myMap = map (not `comp2` not) 

{-# INLINE comp2 #-}
comp2 f g x = f (g x)

We don’t get inlining.

Does that mean adding an INLINE pragma can actually prevent inlining? That seems quite counter intuitive. Or is one of my many logic jumps here wrong, if so which one?

Counterintuitive, but yes, for the time being this is correct.

1 Like

Ah, thanks! Does this counterintuitive behaviour apply with {-# INLINABLE #-} as well?