I thought we were discussing matters pertaining to (ordinary) strictness, not hyper-strictness…
Ah, then I think this whole discussion has been one big misunderstanding. The answer to @sgraf’s question:
is simply that you aren’t allowed to nest the annotations inside a data type like that.
You could only use it on function arguments like this:
tuple :: !Int -> !Int -> (Int, Int)
tuple x y = (x, y)
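(For comparison, a rough equivalent of that hypothetical signature in today’s GHC puts the strictness in the patterns rather than in the type, assuming BangPatterns:)
{-# LANGUAGE BangPatterns #-}

-- both arguments are forced to WHNF before the body is evaluated,
-- approximating the hypothetical !Int -> !Int -> (Int, Int) signature
tuple :: Int -> Int -> (Int, Int)
tuple !x !y = (x, y)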
And perhaps to variables like:
x :: !Int
x = 1 + 2 + 3
But that is difficult if that is a top-level binding (I have worked on that problem during my internship with Well-Typed).
Lazier-ness in action:
ghci> map (const '\a') [let e = error e in e, let s = '\a':s in s]
"\a\a"
ghci>
So map is indifferent to whether its arguments are fully evaluated or not (or only partially so). That would be modulated/adjusted/“fine-tuned” by using strictness annotations (however they may be implemented) as needed to obtain the result desired by the programmer.
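For instance (a hedged sketch using ordinary seq rather than the hypothetical argument annotations above), a map that forces each element before consing no longer ignores them, and would error/loop on the lists from the GHCi session above instead of returning "\a\a":
mapStrict :: (a -> b) -> [a] -> [b]
mapStrict _ []     = []
-- force each element to WHNF before consing, so `const '\a'`
-- never gets the chance to ignore it
mapStrict f (x:xs) = x `seq` (f x : mapStrict f xs)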
And perhaps to variables like:
x :: !Int
x = 1 + 2 + 3
But that is difficult if that is a top-level binding […]
So wouldn’t that also be a problem for -XStrict?
You could only use it on function arguments like this
Regarding this idea specifically, I have an old article on a related issue.
State monad - memory exhausted
- Why did this happen? Because GHC uses a GC-collected heap, and GC can’t reclaim everything when needed.
- The simple solution? Dump the GC, and maybe even the heap as well:

…all going well: no need for strictness annotations, Unlifted, StrictT and the rest!
Dump the GC, and maybe even the heap as well:
Pardon my naivety, but the paper is from 2011 and the non-moving GC was merged in 2020; surely those 30 pages of lambda calculus don’t just solve one of the cornerstone problems of compiler design and no-one noticed it? I unfortunately don’t have a decade of the necessary background to address the paper directly (or even to read it properly).
I commented on that paper here: What about having StrictData and non-strict semantics on functions as the default programming style? - #13 by sgraf. TL;DR: I don’t think it is of implementational relevance.
So wouldn’t that also be a problem for -XStrict?
Yes, -XStrict does nothing with top-level bindings. E.g. this compiles and doesn’t cause infinite loops:
{-# LANGUAGE Strict #-}
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
main = print (fibs !! 10)
Ah, it’s also in the documentation:
Top level bindings are unaffected by Strict. […] Reason: there is no good moment to force them, until first use.
[…] surely those 30 pages of lambda calculus don’t just solve one of the cornerstone problems of compiler design and no-one noticed it?
I don’t think it is of implementational relevance.
Prior to the appearance of those 35 pages, the choice between the push/enter and eval/apply styles was largely arbitrary. But in GHC, the eval/apply implementation did avoid an irritating source of complexity:
(page 3)
push/enter requires a stack like no other: stack-walking is more difficult, and compiling to an intermediate language like C or C-- is awkward or impossible.
and primarily for this reason, the push/enter implementation was abandoned. Now just imagine what could also be abandoned if GHC (like proto-Rust eventually did) no longer had GC…
[…] -XStrict does nothing with top-level bindings.
Alright, back to this example of yours:
x :: !Int
x = 1 + 2 + 3
…if this was difficult, can I at least assume that something like
x :: Int#
x = ...
or:
x :: Unlifted @!%$&# ...
x = ...
would be similarly awkward?
Yes, the unlifted part is what I have worked on (!10841). The main issue is that unlifted variables must never be thunks, so they need to be fully evaluated before the program starts. And GHC has no way of evaluating code at compile-time in a reliable way (constant propagation and similar optimisations don’t give any guarantees).
So I started out with the idea of allowing only constants to be bound in this way: instead of 1 + 2 + 3 you would have to write I# 6# (or, more precisely, the unlifted equivalent of that). Even number literals are a problem, because they are desugared to fromInteger function calls.
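(To illustrate with the lifted version, which already compiles today: the binding is written as an already-evaluated constructor application rather than as a thunk. The unlifted counterpart would be analogous, but remains hypothetical until !10841 lands.)
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int(I#))

-- a constant written directly as an evaluated constructor application,
-- instead of as the thunk `1 + 2 + 3`
x :: Int
x = I# 6#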
At the moment it has stalled a bit on implementing the required semantics in the GHCi bytecode interpreter, which does not have the ability to allocate a Haskell value as static data: it currently just makes every top-level variable a thunk. The native back end already had that ability as an optimisation, so I initially thought it would be pretty easy.
So why are we here yet again, discussing the relative merits of unlifted/unboxed types vs “unlifted classes/families” vs strictness annotations, etc., etc.:
State monad - memory exhausted
- There is an infinite number of ways for GC-based Haskell programs to “leak space”.
- Haskell implementations like GHC are finite programs.
- Therefore it will never be possible for any Haskell implementation to “plug all of those leaks”.
So if you want a “leak-free” Haskell, then you want a Haskell without a garbage-collected heap - all non-trivial attempts to do otherwise will ultimately be futile.
For example, and as successful as it was, optimistic evaluation would still be prone to retaining too much space. If speculative evaluation is interrupted “too early”, then an overly-spacious thunk may not be replaced with its more-compact result.
In the context of I/O models:
(page 2 of 4)
While concurrency is an important tool in real-time programming, being forced to use it just to circumvent an inappropriate I/O model is not satisfactory.
Reactive Objects (2002)
Similarly, needing to use advanced evaluation techniques just to circumvent the glaring deficiencies of garbage-collected heaps is also unsatisfactory.
I don’t believe it is possible to have a practical Haskell without a garbage collected heap. At least not unless you want to fall back on manual memory management like Rust has.
The only evidence you’ve provided that it would be possible is a paper that describes a “storeless” and “heapless” semantics in a theoretical sense. As far as I understand, the semantics they describe transform the programs themselves, so instead of allocating to a heap they just make the program larger. And I see no mention of memory reclamation, so that would be equivalent to simply never running the GC in a Haskell program; then pretty much every program would be one big leak.
Take for example this trace from Section 2:
This is a storeless semantics and, as you can see, the program pretty much only ever grows. And notice the y that is shadowed on the last two lines, which could be reclaimed.
They unfortunately don’t show any traces of the actual storeless abstract machine, but I see no reason why it would behave differently in this respect.
Now read page 7 of 34…
Ah, they do mention reclamation:
This let expression is needed as long as z occurs free in its body; thereafter it can be elided with a garbage-collection rule [18].
So you’re proposing to replace a garbage collected heap by… a garbage collected program?
Unfortunately that cited paper doesn’t seem to be available even in the library of my university or the worldwide libraries I can search.
Unfortunately that cited paper doesn’t seem to be available even in my university library.
If CSx is working again:
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.51.5026&rep=rep1&type=pdf
…otherwise:
https://archive.org/download/citeseerx-csx_citegraph.2017-03-31/citeseerx_checksums.tsv.gz
…then search for "10.1.1.51.5026"
:-/
… a garbage collected program?
You’ll need to be more specific:
- a program which incrementally “tidies up” after itself as it runs, “incremental-GC” style;
- or a program which never allocates, instead reusing its own heap space:
You’ll need to be more specific:
I’m asking how you envision an alternative to garbage collection and how that would solve the problems of leaks. But here’s my thoughts on those two options:
a program which incrementally “tidies up” after itself as it runs, “incremental-GC” style;
I could imagine a Haskell with incremental GC, perhaps using the new reference-counting, functional-but-in-place (FBIP) approach. But I fail to see how that would prevent leaks. I believe incremental GC would only really be a solution to GC pauses; it’s not like Haskell programs leak because the GC doesn’t run often enough.
or a program which never allocates, instead reusing its own heap space:
I could also imagine a fully linear Haskell, but then that would basically mean manual memory management like Rust has. I wouldn’t want to manually manage my memory like that in all my programs.
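As a toy sketch of that “use exactly once” discipline (with hypothetical Buffer/free names standing in for a real resource API):
{-# LANGUAGE LinearTypes #-}

data Buffer = Buffer

-- `free` consumes the Buffer; the type checker then rejects any further use,
-- which is essentially type-enforced manual memory management
free :: Buffer %1 -> ()
free Buffer = ()

useOnce :: Buffer %1 -> ()
useOnce b = free b
-- useTwice b = free b `seq` free b   -- rejected: b would be consumed twice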
Since the SML source code for the prototype is no longer at the URL listed at the bottom of page 4:
http://www.zerny.dk/def-int-for-call-by-need.html
…I’m left with no other alternative but to build an all-new prototype - just not in SML, which is also GC-based. It’s another reason for my interest in Rust, and the appearance of its second compiler.
But I’ve been wrong in the past, so why should this time be any different? It could be that sometime in the middle of the year, if I have a working prototype in a non-GC language, it ends up leaking stack space instead of heap space - darn. Fortunately, there’s another option, courtesy of one Robert Ennals:
https://web.archive.org/web/20130610122750/https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.9.9901&rep=rep1&type=pdf
…and this was a working system! But for reasons which I’ve yet to see explained satisfactorily, it went nowhere (then Robert went elsewhere). If all the old experimental versions/branches are still available for GHC, I believe speceval2 contains the actual implementation.
GHC has no way of evaluating code at compile-time in a reliable way
On a tangent, are there any plans to address this? The inability to type-check and properly pack literals is a very weird pain point, and I don’t like the approach of incorrect instances plus RULES pragmas as a solution.
Personally, I think Template Haskell is the easiest way to address it. You’d write something like:
x = $$(evalTH [|| 1 + 2 + 3 ||])
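A minimal sketch of one way such a helper could look today (note: evalTH is hypothetical, and this variant takes the value directly rather than a typed quotation, since Template Haskell has no built-in interpreter for quoted code):
module EvalTH (evalTH) where

import Control.DeepSeq (NFData, force)
import Language.Haskell.TH (Q)
import Language.Haskell.TH.Syntax (Code, Lift, liftTyped)

-- force the argument to normal form while the splice is run
-- (i.e. at compile time), then embed the result back as an expression
evalTH :: (Lift a, NFData a) => a -> Code Q a
evalTH = liftTyped . force
With that variant the example would read x = $$(evalTH (1 + 2 + 3 :: Int)).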
But another option is to identify a subset of the language that can be evaluated at compile time, like constexpr in C++ or constant evaluation in Rust.
But perhaps it would also be interesting to explore the possibility of having actual guarantees about the optimisations that GHC performs, such that we could simply rely on the existing optimisations to do this evaluation for us.