# How does this nth Prime program terminate?

So here is a solution I found for finding the nth prime. I’m trying to figure out how it ever stops given its use of `[2..]` in `primes`. Nothing seems to stay at what point it should stop.

I’m also wondering of the call `sieve [2..]` is sending 1 number at a time which I think it is because it can’t send infinity.

``````main :: IO ()

nthPrimeIdx :: Int
nthPrimeIdx = nthPrime - 1

nthPrime :: Int
nthPrime = 10001

primes :: [Int]
-- primes = sieve [2..]
primes = sieve [2..]

sieve :: [Int] -> [Int]
sieve [] = [] -- added
sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p /= 0]
``````
1 Like

At a high level, note that Haskell is lazy. So it’s very common to see infinite data structures in finite time computations. For example `take 10 [2..]` terminates just fine.

To understand why this particular example terminates, note that lists are constructed lazily. `primes` is certainly an infinite list, but let’s say you ask for the 1st element of it, i.e. `head primes`. To compute this, Haskell starts evaluating `sieve [2..]`. It uses the equation `sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p /= 0]`. This immediately gives that the first element of `primes` is `2`, so we don’t need to evaluate any more.

Finding the nth element works similarly.

2 Likes

Still not seeing it :(. I’ll give you my thinking and maybe you’ll see what I’m missing.

I’m assuming that nothing mythical is happening and at some point one of the recursive calls to sieve returns [] so that is the end. But if its supply (`[2..]`) is infinite how does it ever get smaller.

Questions in my mind:

• How `[2..]` interacts with `(p:xs)`.

• When `sieve` is clled with `sieve [2..]` What gets sent to `sieve`.

• Is it `2`, `2:[]` or `2:[some number of ints in here]`.
• Maybe `2:[3..]` - which still looks infinite

• Is there some way that `sieve [2..]` knows how many Ints to send?

Not saying you are wrong but in my, perhaps incorrect, understanding, that looks very different to what we have. `head [2..]` is `2` and that is finite.

Also since `head [2..]` is `2`, `sieve 2` is an error as expected. `sieve [2]` works but again is finite.

When `sieve` is called with `sieve [2..]` What gets sent to `sieve`?

Good question! What gets sent to `sieve` is `(2:infiniteRestofList)`, in other words, the entirety of `[2..]`.

So `sieve` receives that and applies the equation `sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p /= 0]`. So far so good?

This means that `sieve [2..]` equals `2 : someMoreStuff`. Agree so far?

So if I were to ask for `head (sieve [2..])`, I would get `2`, and the `someMoreStuff`, which you rightly suspect is an infinite list, would never get evaluated.

2 Likes

Of course, there’s more to the story, because if I asked for `head (tail (sieve [2..]))`, I would have to start evaluating `someMoreStuff`, where `someMoreStuff = sieve [x | x <- [3..], x mod 2 /= 0]`.

But crucially I wouldn’t have to evaluate all of it. In particular, I’d now start evaluating `sieve [x | x <- [3..], x mod 2 /= 0]`, again using the definition of `sieve`.

To do this, I would need to determine whether `[x | x <- xs, x mod p /= 0]` is of the form `[]` or the form `p : xs`, to determine which pattern to match. So I have to start evaluating `[x | x <- xs, x mod 2 /= 0]`. Doing so reveals that the first element is `3`, so I can now use the equation `sieve (p:xs) = p : sieve [x | x <- xs, x mod 2 /= 0]` to calculate that `someMoreStuff = 3 : evenMoreStuff`.

And so on.

This kind of this is indeed quite surprising when first seen, so your confusion is very understandable

I would like to add to the excellent explanation of @reuben, that the recursive function `sieve` only terminates because ultimately we retrieve a specific element at a specific position and discard the “rest”. `sieve` does not terminate in that the “recursion gets smaller” as recursion often does. To the contrary, the “rest” is infinite, and does not terminate. Maybe that helps?

Also, I would like to say that these Haskell classical examples are slow compared to other algorithms available! So don’t compute your prime numbers using `sieve` .

1 Like

The same basic mechanism is used in Haskell, but:

• the order of evaluation isn’t (usually) so sequential;
• it isn’t limited to logical expressions;
• and it’s part of the implementation.

So when `(!!)` finally counts down to zero it also has no further need of the rest of `primes`, which it then discards.

I think the best way of thinking about it is that `sieve [2..]` literally sends `[2..]`, not some expanded form but just the unevaluated form `[2..]` which is a finite representation of the infinite list (I’m able to write it here in this comment after all!).

When your program pattern matches on this infinite list it computes just enough to do the matching, so it indeed becomes `2:[3..]`, but that is still a finite representation of an infinite list.

That is why it is called lazy: it only generates elements of the infinite list if it is forced to do so (by a pattern match), otherwise it will just procrastinate and pass along the unevaluated form.

2 Likes
``````alt = sieve [2..] !! 2
``````

That’s a lesson I’ll remember.

I really didn’t see that. I think in my mind I’m still looking for some kind of if…then, for, while etc.

May I shamelessly plug

where I try to get to the bottom of this (heh) using a tool that inspects the heap of the running Haskell program

7 Likes

Ghc-vis and ghc-heap-view are awesome. I’ve been playing with the idea of combining ghc-vis or something similar with -finfo-table-map to print thunks and function pointers, and to resolve the relevant core code. It sort of work but only with compiled code because GHCi does not populate the info-table-map and the breakpoint source locations aren’t externally accessible either.

But the core for sieve is a mess so maybe it wouldn’t be helpful anyway

Maybe the GHC-core code for the infinite list might help with intuitions? It’s an intermediate language which GHC uses and which is a bit more explicit.

``````Rec {
primes_go3
= \ x_srPd ->
let {
sat_srPh
= case x_srPd of wild_srPf {
__DEFAULT ->
case +# wild_srPf 1# of sat_srPg { __DEFAULT ->
primes_go3 sat_srPg
};
9223372036854775807# -> []
} } in
let { sat_srPe = I# x_srPd } in : sat_srPe sat_srPh
end Rec }
``````

Here, the let bindings correspond pretty closely to allocations on the heap. You can translate this pretty directly to imperative code, where thunks are zero-argument functions which cache their results.

``````function mkList(i) {
console.log("mkList", i);
function thunk() {
if (this.result != null) { return this.result; }
this.result = mkList(i+1);
return this.result;
}
p = i;
return [p, thunk];
}

> a = mkList(2)
mkList 2 debugger eval code:2:13
Array [ 2, thunk() ]

>a[1]()[1]()[0]
mkList 3 debugger eval code:2:13
mkList 4 debugger eval code:2:13
4

>a[1]()[1]()[0]
4
``````
2 Likes

There is a nice video about this question by Graham Hutton: Infinite Data Structures: To Infinity & Beyond! - Computerphile - YouTube .

2 Likes