Different behaviour of functions when supplied with signature or not


#1

I have the following two functions (I realise they are badly written, and couldn’t possibly work) but they come from an introduction to functional programming and illustrate the need for a base case in recursion.

sum1 l = head l + sum1 (tail l)

sum2 :: [Float] -> Float
sum2 l = head l + sum2 (tail l)

So the only difference apart from the names, is that one has a type signature and the other doesn’t. If I load them in to WinGCHi (although I’ve also tried it on repl.it), then evaluating sum2 [1…10] gives, as I’d expect, an exception, but evaluating sum1 [1…10] hangs until interrupted (see console log below).

Prelude> :load “ss.hs”
[1 of 1] Compiling Main ( ss.hs, interpreted )
Ok, modules loaded: Main.
*Main> :t sum1
sum1 :: Num a => [a] -> a
*Main> :t sum2
sum2 :: [Float] -> Float
*Main> sum1 [1…10]
Interrupted.
*Main> sum2 [1…10]
*** Exception: Prelude.head: empty list
*Main> _

The types of the two functions are basically the same.

I can’t understand why the difference in the evaluations.

John


#2

This is a very good question. Both programs have a similar shape:

sumX l = head l + sumX (tail l)

What happens when this is evaluated? Well, it depends on the meaning of +.

  1. We evaluate head l, then evaluate sumX (tail l), then add them together.
  2. We evaluate sumX (tail l), then evaluate head l, then add them together.

If (1) is used, the program immediately crashes; since taking the head of the empty list is not possible.

However, if evaluation strategy (2) is used, we need to evaluate sumX (tail l), which means evaluating:

sumX (tail l)
  = head (tail l) + sumX (tail (tail l))
  -- Evaluate the right argument of + first
  = head (tail l) + head (tail (tail l)) + sumX (tail (tail (tail l)))
  -- Evaluate the right argument of + first
  = ...

Continuing down this path results in infinite recursion.

Now it’s worth noting that at least from a theoretical perspective, undefined, error and infinite recursion are the same value ⊥ (pronounced bottom), so from this viewpoint it doesn’t matter which argument of + is evaluated first.

So, the remaining question is: how does GHC decide what argument of + it should evaluate first? And the answer is not magic at all, but it is somewhat low-level: it depends on the implementation of + for the type that you give it.

In your code, sum2 works on Float. If we look at the implementation of + for Float, we see that we first case match the constructor F# on both sides; which results at the left one being matched first, resulting in the error you see.

sum1 does not have a type signature, which means it is polymorphic. If you execute this code with the argument [1 .. 10], GHC will default [1 .. 10] to be [1 .. 10] :: [Integer] (arbitrary precision), and not e.g. [1 .. 10] :: [Int] (32- or 64- bit int).

If we chase down the implementation of + for Integer, we see that here we indeed try to match the right argument of + first, resulting in infinite recursion.

Hope this helps!


#3

Thanks for that. That helps a bit, but I’ve never been so ‘low’ in Haskell before. In the implementation of + for integers is it the line

plusInteger x (S# 0#) = x

that causes the evaluation of the right hand argument first?

Am I right in thinking that if I pass an empty list to the function sumX evaluation strategy 2 does not give an error on evaluating (tail []) because of lazy evaluation, i.e., it builds up the chain of tail calls but never actually gets around to evaluating any of them?

John


#4

In the implementation of + for integers is it the line

plusInteger x (S# 0#) = x

that causes the evaluation of the right hand argument first?

Yep! This line case matches on the right hand argument, to match the constructor S#. In order for a Haskell runtime to evaluate whether or not this match succeeds, it needs to reduce the right hand argument to a single constructor (commonly called WHNF, Weak-Head Normal Form). It is during conversion to this constructor that the infinite recursion occurs.

Am I right in thinking that if I pass an empty list to the function sumX evaluation strategy 2 does not give an error on evaluating (tail []) because of lazy evaluation, i.e., it builds up the chain of tail calls but never actually gets around to evaluating any of them?

Yes, that is also completely correct!


#5

Many thanks for your help with this. It’s taken me to parts of Haskell I’ve never known existed.

My reading of the various cases for plusInteger is that they do optimisation, e.g., x + 0 is transformed to zero, 0 + x to X and so on. Is that right?

John


#6

Just another thought or two.

I suppose the reason that + can have different ‘orders’ of evaluation depending on the type without causing any problems with commutativity is because of the lack of side effects?

And if the compiler didn’t have those two optimisation lines for Integer + the third case starting

plusInteger (S# x) (S# y)

would be the first to apply, which would evaluates arguments exactly as Float + does.

John


#7

I think you are right – although the lines may be necessary (like some sort of base case) and not just an optimization; I didn’t dig into it that deeply.