How do I call a value at run time?

Consider this program:

module Main where

main ∷ IO ( )
main = do
  x ← readLn
  let xs = [1..x]
  -- Location 1. `xs` is not yet evaluated.
  print xs
  -- Location 2. `xs` is evaluated.
  return ( )

Statements in a do block are executed in the sequence that they are written. So, locations between statements correspond to moments of «logical time». Further, to every moment, and thus to every location, corresponds a state of memory containing all the values that are in scope at that location.

In the code above, the value xs is visible both from location 1 and location 2. But, due to laziness, the state of memory is not the same at the corresponding moments.

  • At the moment 1, xs is a thunk enumFromTo 1 x.
  • At the moment 2, xs is something like Cons 1 (Cons 2 …), made of data constructors, and free from thunks.

How do I refer to the abstract representation of a value in memory in terms of data constructors and thunks? What is the terminology? What are these things that both represent the same value xs?

I know of a package called ghc-heapview which lets you view the heap representation of a value without evaluating it.

I don’t really understand the other questions. I think you are using the right terminology: thunks and constructors.

Edit: man terminology around calling and referring is confusing in the context of programming. Reading your post again I think you are not looking for that ghc-heapview package. But I think it could still be useful. It seems to use the umbrella term “closure” for all the different kinds of values in memory. Maybe that answers some of your questions?

1 Like

I do not mind getting to know some nice packages. But my question is mainly about the «official» way to refer to these mysterious things.

So, we have thunks and constructors.

  • Is this an exhaustive list of what we may have as a value at a time?

    No, there are also primitive types and who knows what other GHC magic. I only want to talk about laziness and evaluation, so for me only the difference between thunks and every other sort of things matters. Is there a word for everything that is not a thunk?

  • How do I call the sum of all these things? I did hear the word «Closure» being spoken in relation to evaluation in Haskell, but the definition is alien to my mind.

    In programming languages, a closure, also lexical closure or function closure, is a technique for implementing lexically scoped name binding in a language with first-class functions. Operationally, a closure is a record storing a function[a] together with an environment.

    How does this apply to Haskell, if at all?

So far more questions than answers.

How do I refer to the abstract representation of a value in memory in terms of data constructors and thunks? What is the terminology? What are these things that both represent the same value xs?

I guess in lambda calculus, they would be simply called terms. A “thunk” in the Haskell heap would correspond to a term that is a reducible expression, or redex.

Terms that are “fully evaluated” and don’t have any thunks are said to be in normal form. In Haskell, the related concept of “having been evaluated up to the outermost constructor” is called weak head normal form. Everything in normal form is also in WHNF, but not vice-versa.

You can find the relevant terminology in “Making a fast curry: Push/Enter vs Eval/Apply for higher order languages” by Ss M and PJ. The heap objects are

  • FUN (function closure)
  • PAP (partial application)
  • CON (constructor)
  • THUNK
  • BLACKHOLE

As far as I am aware the implementation is reasonably true to this specification.

1 Like

That ghc-heapview package also links through to this wiki page which seems to have all the information and then some: https://gitlab.haskell.org/ghc/ghc/-/wikis/commentary/rts/storage/heap-objects. And I think the GenClosure type is an exhaustive list of all the different kind of heap objects.

1 Like

Thanks everyone!

So, two ways to call a value at run time have been offered:

  • A lambda term. The problem here is that I am not sure how constructors would be represented in lambda calculus. In Haskell, a data constructor has the same «arrow» type as a function that actually computes something, but we want to make a distinction, so we should need to extend, say, simply typed lambda calculus with two different flavours of arrows. And then we should either prove that everything still works if we want to call this new construction a lambda calculus.

  • A run time system closure. I have no idea why, but things that live on the heap are called closures in GHC. How about values that live on the stack? Not sure. Overall, this is not abstract enough to be accessible for a commoner. I would like to be able to reason about laziness abstractly — without knowing or caring about how the run time system works.

I am going to eventually read some of the articles supplied in messages above, and maybe a clearer understanding will emerge.

Informally, a value that still might have computation left I would refer to as “thunked” and a value that has been fully evaluated I would refer to as “forced”. A “forced” value would be described more precisely, as above, as “representing an expression in weak head normal form”.

1 Like