Figuring out why a seemingly alternative definition causes stack overflow

I am doing some exercises where I will generate an infinite sequence such that the nth number is the maximum power of 2 that divides into n. Hints ask me to first implement an auxiliary function interleave which picks elements from two lists alternately. For example, interleave [1,2,3] [4,5,6] produces [1,4,2,5,3,6]. So I write a function mySeq n = interleave (repeat n) (mySeq (n+1)) and the desired answer is let n = 0 in mySeq n. The overflow issue is in the implementation of interleave.

The one that overflows is

interleave :: [a] -> [a] -> [a]
interleave [] y = y
interleave x [] = x
interleave (x:xt) (y:yt) = x : y : interleave xt yt

The one that works fine is

interleave :: [a] -> [a] -> [a]
interleave [] y = y
interleave (x:xt) y = x : interleave y xt

What causes the difference?

3 Likes

Since this appears to be a homework question…one of the nice things about Haskell is that you can use a “pencil and paper” approach to evaluating Haskell definitions. I suggest starting with the intended answer:

    let n = 0 in mySeq n
==> mySeq 0               {- binder substitution -}
==> ...                   {- definition of mySeq -}
    â‹®

and work from there. If you do obtain the answer, just tell us that rather than presenting the actual answer here.

Interesting puzzle! I was confused by the difference between the two functions; they look like they should be identical. But the first one can’t make progress until it has evaluated both input lists. This alternative can make progress:

interleave :: [a] -> [a] -> [a]
interleave [] y = y
interleave (x:xt) yts = x :
  case yts of
    [] -> []
    y : yt -> y : interleave xt yt
3 Likes

Thank you! Now I know that the pattern matching causes the problem. What a subtle difference! I think I could not even notice that until I have made this mistake. Your alternative solution really helps.

What exactly do you mean with “stack overflow”? As far as I know GHC uses an infinite stack, so I didn’t think it was possible to overflow.

To test this out I wrote this program:

interleave :: [a] -> [a] -> [a]
interleave [] y = y
interleave x [] = x
interleave (x:xt) (y:yt) = x : y : interleave xt yt

seqList (x:xs) = x `seq` seqList xs
seqList [] = ()

main :: IO ()
main = seqList (interleave [1..1000000000] [1..1000000000]) `seq` pure ()

And that runs fine. It takes about 25 seconds to finish on my machine, but no overflow is happening.

Your other “working” version is slightly faster at about 20 seconds, but that’s not a very big difference.

I guess I am missing something. To properly display the infinite list I am trying to generate, I test it using take n $ mySeq 0. The overflowing one causes overflow even if n is 1. The other one works fine. So I guess the overflowing one gets trapped into infinite recursion before it can work out the head of the list. Finite lists cannot lead to the problem as they can be finally worked out.

1 Like

Ah, my bad. I completely overlooked the first half of your post.

And in GHCi I indeed also get a stack overflow. If you compile the program using ghc -O2 then it will just consume all your memory and crash that way without a stack overflow.

This is not causing your problem, but it bugs me that mySeq is accumulating a thunk. I would use the BangPatterns extension and apply a bang pattern to the argument of mySeq:

{-# LANGUAGE BangPatterns #-}

mySeq !n = interleave (repeat n) (mySeq (n+1))

The reason that doesn’t cause a problem is that the output numbers are small, since powers of two only grow logarithmically in the length of the list. Also if you print all the output, it would force evaluation of the thunks as it went along anyway. But it’s just in general a good practice to force evaluation of an accumulator.

You can also do it without BangPatterns:

mySeq n = n `seq` interleave (repeat n) (mySeq (n+1))
1 Like