Oh, absolutely, thanks!
With your approach of using the TH `Name`, doesn’t that lead to problems when one name is used twice, in an abstracted-out definition?
```haskell
number :: Int -> Parser Int
number b = 'number
  ::= (\x y -> b * x + y) <$> number b <*> digit b
  <|> digit b

decimal :: Parser Int
decimal = number 10

binary :: Parser Int
binary = number 2
```
In that case, doesn’t your library get confused that there are two different non-terminals (`decimal` and `binary`) that use the same `Name`? (At least if they are mutually recursive?)
There is indeed a problem in a similar case, but I think the example you give is fine, because the only thing I want to detect is loops, and `decimal` and `binary` do not create extra loops.

But the thing I have no solution for yet is when higher-order combinators are involved, such as `maybe`:
```haskell
maybe p = 'maybe ::= Just <$> p <|> pure Nothing
```
If you naively write it like that and then use it as `maybe (maybe (char 'a'))`, then gigaparsec will currently think there is a loop (because it encounters two `'maybe`s without consuming any input). I hope to figure out a way to write something like:
```haskell
maybe p = app 'maybe (name p) ::= Just <$> p <|> pure Nothing
```
which would then allow me to differentiate between the inner `maybe` and the outer `maybe`, but I haven’t figured that out yet.
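(Spelled out on the example, and purely hypothetically, since `app` and `name` don’t exist yet: in `maybe (maybe (char 'a'))` the outer occurrence would be named `app 'maybe (app 'maybe (name (char 'a')))` and the inner one `app 'maybe (name (char 'a'))`, so the loop check would see two distinct non-terminals instead of two hits on `'maybe`.)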
Right, the example was incomplete, hence the “if they were mutually recursive”.
I fear that using TH names is still going to have most of the problems of manually assigning names. It’s great in the first-order case, but as soon as you introduce abstraction, it becomes problematic.
I should still figure out how to apply the rec-def principle to parsers. We only need special care due to left recursion, and for right recursion normal laziness would just work, right?
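(A minimal sketch of what I mean, with an assumed `Parser` type and `digit :: Parser Char` combinator, not gigaparsec’s actual API: here the recursive call only happens after `digit` has consumed input, so lazily unfolding the definition is already productive.)

```haskell
-- Right recursion: the recursive occurrence of digits sits behind a
-- digit that must consume input first, so ordinary laziness suffices.
digits :: Parser [Char]
digits = (:) <$> digit <*> digits
     <|> pure []
```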
Looking quickly through the references: you’ve already included the main articles regarding observable sharing in hardware-description languages - that leaves two other possible directions to investigate:
- Jennifer Hackett and Graham Hutton’s paper about “an alternate semantics for laziness”, to avoid the complicated reasoning associated with using heaps for this task.

- As for `unsafe` pseudo-definitions being used (more) safely, there was a long thread here about it, with this probably being the ~~least-opinionated~~ most useful of my posts there for your purposes:
Thanks! That paper, “Call-by-need is clairvoyant call-by-value”, is super helpful; maybe Clairvoyant Call by Need indeed solves my problems.
@Lysxia, you formalized Clairvoyant Call by Need in Coq… what do you think? (Based on your paper you did not consider non-structurally-recursive definitions, so probably no knot-tying in your formalization, right?)
The paper doesn’t give the denotational semantics for the language extended with data types, but I can guess what it would look like:
- adding a (simple) `S x` constructor to the syntax,
- changing the value domain to `V ≅ (V_⊥ → D) + D` (which I’ll write as `S d` as well),
- adding the equation `[[ S x ]]ϱ = 0 ▹ S (ϱ(x))`.
Now the denotation of `let x = S x in x` seems to be, if my scribblings are correct, `2 ▹ (µ x. C (0 ▹ x))`, while for `let x = λ y. S (x y) in x e` (for arbitrary `e`) I get `6 ▹ (µ x. C (4 ▹ x))` (the cost counters may be off a bit).

So if “cost zero” means “value is shared”, then indeed this denotational semantics allows me to distinguish knot-tying vs. untied knots!
(And probably I can simplify by only keeping track of 0 vs. not-0 in the semantics.)
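(One way to make that precise, my gloss: collapse the cost monoid to two points,

$$(\mathbb{N}, +, 0) \;\rightsquigarrow\; (\{0, \neq\!0\}, \vee, 0),$$

so that accumulation `c₁ ▹ c₂ ▹ …` just tracks whether any cost was non-zero.)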
What I am unsure about now: can I introduce an operation `rseq x` that is `()` if `x` evaluates to a cyclic data structure, and ⊥ else? It seems that “evaluates to a cyclic data structure” means “can be evaluated to an (infinite) tree of constructor applications where the overall cost is finite”, or equivalently “all but finitely many costs are 0”. I can certainly define an operation `D → D` that way, but I worry: is it continuous?
It seems not. In this particular partial order we have

```
(µ x. C (1 ▹ x))
  ⊏ C (0 ▹ (µ x. C (1 ▹ x)))
  ⊏ C (0 ▹ C (0 ▹ (µ x. C (1 ▹ x))))
  ⊏ …
  ⊏ (µ x. C (0 ▹ x))
```

but `rseq (µ x. C (1 ▹ x)) = ⊥` and `rseq (C (0 ▹ (µ x. C (1 ▹ x)))) = ⊥`, and also `rseq (µ x. C (0 ▹ x)) = ()`. This is monotonic, but not continuous.
So we cannot “simply” add this operation to the language. Too bad.
I presume this is the paper you’re referring to:
and one I had forgotten about - it appeared in a search while looking for ways to solve another problem.
Regarding `rseq`: how would it have worked if it were applied (indirectly) to itself? It seems to vaguely resemble a termination checker…
Vaguely, and that is surely related to the troubles we are having.
But in the operational setting with an explicit heap we can see that it detects finite graphs (and doesn’t terminate for infinite graphs), which is fine. If it is used recursively, it’ll just diverge.
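(A rough sketch of that operational idea in Haskell, using stable names to spot the back-edge; this is only an illustration on a toy type, not how any of the libraries here actually implement it.)

```haskell
import System.Mem.StableName (StableName, makeStableName, eqStableName)

-- A toy type of possibly cyclic data.
data T = C Int T

-- Walk the structure, remembering the nodes seen so far. On a tied
-- knot we revisit a node and terminate; on an infinite unshared
-- structure the walk diverges.
rseqIO :: T -> IO ()
rseqIO = go []
  where
    go seen t@(C _ t') = do
      sn <- makeStableName t
      if any (eqStableName sn) seen
        then pure ()           -- back-edge found: a finite graph
        else go (sn : seen) t'
```

For example, `rseqIO (let x = C 0 x in x)` terminates, while `rseqIO (let f n = C n (f (n + 1)) in f 0)` diverges.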
But the denotational setting doesn’t quite work. Maybe because the finite graph is infinitely unrolled, and recognizing that some infinite tree came from a finite graph is now indeed too close to a termination checker, and that’s why it doesn’t work yet?
Several years ago, I noticed this small article:
While the notation is more “machine-like”, perhaps the technique described could be useful for more precisely specifying which graphs are suitable for the intended purpose.
Actually, someone just pointed out to me that I was too pessimistic: if some function in the semantic domain isn’t continuous, but still monotonic, that’s still a well-defined semantics! Just not so computable any more, but that might be OK. (Monotonic functions on CPOs have least fixed points, but if the function isn’t continuous, the fixed point may not simply be the limit of fⁿ(⊥).)
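(For reference, the standard fact alluded to here: for monotone $f$ on a pointed dcpo the least fixed point is still reached by iteration, just possibly only after transfinitely many steps,

$$f^{0}(\bot) = \bot, \qquad f^{\alpha+1}(\bot) = f\big(f^{\alpha}(\bot)\big), \qquad f^{\lambda}(\bot) = \bigsqcup_{\alpha<\lambda} f^{\alpha}(\bot),$$

and the chain stabilises at some ordinal $\alpha$ with $f^{\alpha}(\bot) = \mu f$; continuity is exactly what lets one stop at $\alpha = \omega$.)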
So maybe I am on the right track. Will be thinking more about this.
What is the definition of `rseq`? You give examples, but no closed definition. Perhaps it is

```
rseq = λd. if d = μ x. C (0 ▹ x) then C (0 ▹ ()) else ⊥
```

But that lacks detection of “finitely many non-0s”.

Is that continuous? I don’t think it is, but I didn’t really see a proof above.
Here is one: `rseq (C (0 ▹^n ⊥)) = ⊥`, so

```
rseq (Lub_n {C (0 ▹^n ⊥)}) = rseq (μ x. C (0 ▹ x)) = () /= ⊥ = Lub_n {rseq (C (0 ▹^n ⊥))}
```
But if it isn’t continuous, I’m a bit doubtful it is of use.
Since you are calling `rseq` on something involving a least fixed point, at some point you want to do fixed-point induction. Then you are out of luck, because it is unlikely that a predicate involving `rseq` is admissible (which is continuity on `D → Prop`).
Of course, you might always have concrete examples in mind where you can unroll the fixed point once to see that the “remainder” is something like `μ x. C (0 ▹ x)`, and then you can apply `rseq`.
A closed definition is somewhat tricky to write down precisely, because `rseq` has to look at the whole expression. In prose, the definition is

```
rseq d = ()  if d consists only of (possibly infinitely many) applications of C
             and only finitely many cost annotations are not 0
       = ⊥   else
```
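As a formula, if I render that prose faithfully:

$$\mathit{rseq}(d) = \begin{cases} () & \text{if } d = C(c_1 \triangleright C(c_2 \triangleright \cdots)) \text{ and } c_i \neq 0 \text{ for only finitely many } i,\\ \bot & \text{otherwise.} \end{cases}$$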
It is monotonic, but not continuous. I think the counterexample is the one you give here, with a sequence of values where `rseq` is bottom on all these elements, but `()` in the limit.
> But if it isn’t continuous, I’m a bit doubtful it is of use.
That’s what I thought, but Peter Thiemann pointed out that all you lose is that the least fixed point can no longer be found as the ω-limit of iterating from bottom; it still exists (as per some variant of Knaster–Tarski).
And you are right: one loses the simple fixed-point iteration as a proof principle. (Quick counterexample: `∀ d, rseq d = ⊥` should hold according to that proof principle, based on our examples, but of course it doesn’t.) Maybe there is an analogous proof principle based on transfinite induction. I’ll start worrying about proofs once I have written down the semantics (I have some promising ideas and might write them down today).
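(One candidate for such a principle, sketched and unchecked: strengthen admissibility from ω-chains to arbitrary chains, to match the transfinite iteration that constructs the fixed point:

$$\frac{P(\bot) \qquad \forall d.\; P(d) \Rightarrow P(f(d)) \qquad \forall\,\text{chains } C.\; (\forall d \in C.\; P(d)) \Rightarrow P\big(\textstyle\bigsqcup C\big)}{P(\mu f)}$$

The counterexample above fails the third premise, as it should.)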
How about this definition: `rseq` is the largest element of `D → D` such that

```
rseq (C (c ▹ x)) = c ▹ rseq x
rseq ⊥           = ⊥
```

and also `rseq d ⊑ 0 ▹ ()` holds. Note that here `c₁ ▹ c₂ ▹ … = (c₁ + c₂) ▹ …`, so this accumulates the costs; one gets `rseq d = n ▹ ()` if the costs add up to `n`, but `rseq d = ⊥` otherwise. The last condition ensures that we don’t get a value other than `()` in the success case.

That we are using a largest fixed point here looks rather promising; this does have a coinductive feel to it, after all.
I started a document at https://www.joachim-breitner.de/publications/rec-def-denotational.pdf. Right now it mostly contains how I can put `RB.get` into a denotational semantics (treat `RS.mk`, `RS.and` etc. as normal constructors, and then solve the infinite formula in `RS.get` in one go).
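(For context, the kind of knot-tied definition the library is about, a minimal example using `RS.insert` and `RS.get` from `Data.Recursive.Set`:)

```haskell
import qualified Data.Recursive.Set as RS
import qualified Data.Set as S

-- Two mutually recursive set equations. Laziness ties the knot between
-- s1 and s2, and RS.get solves the resulting system in one go.
example :: S.Set Int
example = RS.get s1
  where
    s1 = RS.insert 23 s2
    s2 = RS.insert 42 s1

-- expected: S.fromList [23,42]
```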
The `rseq` discussion seems to be rather orthogonal, actually, although I need both in the end to model the behavior of rec-def.
> How about this definition: `rseq` is the largest element of `D → D` such that …
Yes, that seems plausible. One could view evaluating the heap as a coinductive process in which you yield back the cost of the evaluated heap bindings one after another.
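(A toy Haskell rendering of that view, just to make it concrete; this models only the stream of yielded costs, not the actual semantic domain.)

```haskell
-- A value here is an infinite tower of C constructors, each annotated
-- with the cost that was paid to produce it.
data V = C Integer V

-- The coinductive process: productively yield each cost in turn.
costs :: V -> [Integer]
costs (C c v) = c : costs v

-- rseq d = n ▹ () then corresponds to the cost stream summing to n,
-- i.e. all but finitely many yielded costs being 0. That property is
-- not decidable, matching rseq being a specification, not a program.
```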
On the other hand, `rseq (µ x. C (1 ▹ x)) = (µ x. C (1 ▹ x))`, so that is different from what you had before (although I’m quite hazy on how `(µ x. C (1 ▹ x))` is actually represented in the domain), but in a good way: I think the resulting definition is continuous:
```
rseq (Lub_n {C (c ▹^n ⊥)})
  = rseq (μ x. C (c ▹ x))
  = (μ x. C (c ▹ x))
  = Lub_n {C (c ▹^n ⊥)}
  = Lub_n {rseq (C (c ▹^n ⊥))}
```
So if that definition of `rseq` works for you, it is likely easier to handle.
I don’t think that’s correct. By the above equations we have

```
rseq (µ x. C (1 ▹ x))
  = rseq (C (1 ▹ (µ x. C (1 ▹ x))))
  = rseq (1 ▹ (µ x. C (1 ▹ x)))
  = 1 ▹ rseq (µ x. C (1 ▹ x))
```

(where I’m a bit liberal with applying `rseq` to either `D = (ω^op × V)_⊥` or `V`; see Section 5 of the Clairvoyant paper for details on the semantic domain).

But the equation `d = 1 ▹ d` only has one solution in this domain, namely `d = ⊥` (any non-⊥ element carries a finite cost `n`, and `n = 1 + n` has no solution), so we get

```
rseq (µ x. C (1 ▹ x)) = ⊥
```

as desired!
Just for comparison, for the tied knot `(µ x. C (0 ▹ x))` I have

```
rseq (µ x. C (0 ▹ x))
  = rseq (C (0 ▹ (µ x. C (0 ▹ x))))
  = rseq (0 ▹ (µ x. C (0 ▹ x)))
  = 0 ▹ rseq (µ x. C (0 ▹ x))
  = rseq (µ x. C (0 ▹ x))
```

and the equation `d = d` has many solutions, but the largest in the range of `rseq` (which is restricted to `{d ∈ D; d ⊑ 0 ▹ ()}`) is indeed `()`, as desired.
Ah, re-reading what ▹ does (it is not the lifted pair constructor but an operation that adds the LHS to the cost of the RHS) and realising that `C` denotes a constructor value, that makes a lot more sense.

But then I’m a bit confused about the meaning of `rseq (C (c ▹ x)) = c ▹ rseq C x`. Would you say that `rseq (c, C d) = c ▹ rseq d` is an accurate clarification?
Then the interesting things look like `μ d. (c, C d)`, and under `rseq` that is only `()` for `c = 0`, as you say.
Sorry, that was a typo; it should have been `rseq (C (c ▹ x)) = c ▹ rseq x`, as you say. My bad.
There might be another technical problem. If we allow monotone but non-continuous functions in the semantics, we have to re-interpret the domain equation

```
D = (D → D)_⊥
```

(here for just the lambda calculus, no cost accounting), because now `D → D` needs to hold not only continuous functions, but arbitrary monotonic functions.

This raises the question: do recursive domain equations involving the space of monotonic functions still have solutions?
I’m a bit worried that texts like, for example, “The Category-Theoretic Solution of Recursive Domain Equations” by Smyth and Plotkin do not list “DCPOs with monotonic functions” as an example of a category where such domain equations can be solved. Let’s escalate to Stack Exchange.
A search with `solutions recursive domain equation space "monotonic functions"` found this article:

Note: the detailed description/example of the technique makes use of an imperative language (albeit one which is deliberately minimal) - you’ll need to determine its suitability for a large non-strict language like Haskell.
Thanks! Pottier again, he seems to pop up in every second related work. However,

> we cannot do this using “off-the-shelf” techniques for solving recursive domain equations, like those of Rutten [12] or Birkedal et al. [13]. Instead, we obtain a solution by explicitly constructing the inverse limit of an appropriately chosen sequence of approximations …

doesn’t sound very encouraging, I fear (not that I fully understand that paper).
From a rummage through my ol’ bookmarks file:

- Encapsulating non-determinacy in an abstract data type with determinate semantics (1991).
- Permission-Based Ownership: Encapsulating State in Higher-Order Typed Languages (2005).
Perhaps something in those could be useful…