What is the best way to start a GHC proposal?

Suggestions

  • You could start a discussion right here on the Haskell Discourse, describing your idea and seeking feedback.
  • I think it would not be long before you want an actual document that people can comment on, and that you can successively refine. I often use a Google doc for this. Others use HedgeDoc or some other wiki-like thing.
  • At some point you could then open an issue on the GHC Proposals GitHub, as a home for a more focused discussion, still pointing to your Google doc.
  • After that you could turn it into a Proposal, by making a pull request on the GHC Proposals GitHub.

I strongly urge having a document that describes your design as concretely as possible. Otherwise the danger is that you have long discussions in which everyone talks past each other because there isn’t enough precision.

Happy new year!

Simon

I’m afraid you are right there 🙂

That makes sense. Thanks.

I’d be happy to collaborate on something like this; I have also been thinking about it for a long time. I actually have a partially written proposal which you might find interesting: Pure Template Haskell - Google Docs

That’s great. My concern is not so much about IO, but more about the stage restriction and the ability to use code defined in the same file.

I’ve been thinking about something similar to this Pure TH thing recently as well.

I was initially thinking about something similar, but now I think trying to rule out IO is the wrong way to go.

As you mention in your draft proposal, unsafePerformIO still exists. FFI is often another form of hidden IO, and is also very difficult to avoid.

A different form of impurity that isn’t mentioned is being able to access compiler internals by unsafeCoerce-ing into the typechecker monad, e.g. as in Haskell dark arts, part I: importing hidden values.

I think a better approach might be to run TH using the external interpreter inside a container-like environment without networking and only the files it should be able to see. This would disallow access to typechecker internals, as we no longer run in the main process; would disallow most inappropriate uses of IO; and doesn’t require the addition of syntax, or a change to user code.

Something like that would also be useful for FFI calls purporting to be free of I/O activity - an exception can then be raised if the foreign call attempts surreptitious input or output. To disable this, the offending FFI declaration would have to be annotated e.g. with pseudo or feign. If the likes of unsafePerformIO and unsafeCoerce could then be relegated to annotated FFI declarations, Safe Haskell (or its successor) could then ban all feign/pseudo foreign calls.

Ideally I would like to be able to write things like

-- A macro
$((
zipFields :: String -> [String] -> String
zipFields f fields =
      intercalate " "  ["($f  ($field a) ($field b))" | field <- fields]
))

data Point = Point { x :: Double
                   , y :: Double
                   }
addPoint :: Point -> Point -> (Double -> Double) -> Point
addPoint a b = Point $((zipFields "(+)" ["x", "y"]))

This is a contrived example which could probably be written using Generics,
but it shows the type of macro which could be written.
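
Setting the proposed $(( )) syntax aside, the generator itself is ordinary Haskell. A minimal sketch, assuming the $f and $field interpolation is spelled out with (++) and that the record arguments are always named a and b:

```haskell
import Data.List (intercalate)

-- Build the constructor arguments as plain text: one
-- "(f (field a) (field b))" group per field name, separated by spaces.
-- The variable names "a" and "b" are hard-coded, as in the example above.
zipFields :: String -> [String] -> String
zipFields f fields =
  intercalate " "
    [ "(" ++ f ++ " (" ++ field ++ " a) (" ++ field ++ " b))"
    | field <- fields
    ]
```

Here zipFields "(+)" ["x", "y"] evaluates to "((+) (x a) (x b)) ((+) (y a) (y b))", which is the text that would be pasted after Point.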

Another simple example is to do calculation at compile time. In

day = 24*60*60

day might be recalculated every time it is used, whereas in

day = $(( 24*60*60 ))

day is calculated at compile time.
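
For comparison, existing Template Haskell can already express this with lift, which evaluates the value while the splice runs and embeds the result as a literal. This is only a sketch: the Int annotation is an assumption made here, and GHC may well constant-fold the plain definition on its own.

```haskell
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH.Syntax (lift)

-- The multiplication is performed when the splice runs (at compile time);
-- the generated code contains only the resulting integer literal.
day :: Int
day = $(lift (24 * 60 * 60 :: Int))
```

Because the splice refers to no top-level definitions from its own module, it works in a single file.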

The basic idea is to have splices generate a string instead of an AST (an Exp or Type etc.).
A first pass would extract all splices, execute them (as normal Haskell code) and replace them by their result.
The whole expanded file would then be parsed normally.

Generating a string to be parsed, rather than an AST Exp, has several advantages.

1- Writing a string is closer to the target code than an expression.
It is easier to write

"(f $a $b)"

than

AppE (VarE $ mkName "f") (AppE (VarE $ mkName a) (VarE $ mkName b)

The string is just Haskell. The Exp is a new language which needs to be learned.
My Exp is probably wrong, which proves the point.
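
For reference, here is a sketch of what the AST for the application f a b looks like when the parentheses are balanced and the application is left-nested, next to the same expression written as a TH quotation. The names applyByHand and applyByQuote are made up for this illustration, and f is assumed to be looked up by name at the splice site.

```haskell
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH

-- Hand-built AST: function application nests to the left, (f a) b.
applyByHand :: String -> String -> Exp
applyByHand a b =
  AppE (AppE (VarE (mkName "f")) (VarE (mkName a))) (VarE (mkName b))

-- The same expression as a quotation; only the three names are spliced in.
applyByQuote :: String -> String -> Q Exp
applyByQuote a b =
  [| $(varE (mkName "f")) $(varE (mkName a)) $(varE (mkName b)) |]
```

Both build the same Exp value; the quotation hides the AppE and VarE constructors but still needs varE and mkName for names that are only known as strings.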

2- A drawback of writing an AST is that it exposes the internals of GHC, which are quite complex and break
Template Haskell code every time something is added to GHC. This is especially true for type declarations.
This problem doesn’t exist with strings. If something is added to GHC which doesn’t change the syntax,
no change to the macro is needed.

3- Quasi-quoting tries to offer a similar interface, but only works if all the terms are known. It also
raises “capture/scoping” issues: does a variable name belong to the generating or the generated code?
With strings, names within the string are resolved later on.

4- To be able to typecheck the AST, a splice needs to have all identifiers in scope, which means that
a file is compiled section by section. I believe this problem is called (or related to) the “stage restriction”.
As in 3, with strings the names would only be resolved later on.

The argument for splices generating an AST is that they generate something which typechecks.
The counterargument is that the generated code will be typechecked in the next phase, so it is not a
problem in practice. Another counterargument is that people who want “pre-typechecking” can still use TH.

This is orthogonal to pure/impure Template Haskell but could of course be linked to it.

This is a contrived example that excellently demonstrates the mental stumbling blocks one introduces when trying to do metaprogramming without types. What is the type of zipFields? What is the type of addPoint? The types you’ve written for these two declarations make no sense whatsoever—you clearly didn’t write what you intended, and in the case of zipFields I don’t even know what you might have intended, if you aren’t going to type everything as String. The code snippet that zipFields produces might have some type if interpreted as a stand-alone Haskell expression, but as used in this example it can’t be a stand-alone Haskell expression; it’s a sequence of arguments applied to Point. If your proposed macro system tried to assign a type to that thing—any type—it would be wrong. Unless, as I said, you give up on types entirely and call it a String.

Also, you can get quite close to what you wrote using existing Template Haskell and without engaging at all with any of the AST types other than knowing that the type of a quoted expression is Q Exp: try in the Haskell Playground. TH quotations are very convenient and I think you’d have a hard time demonstrating that untyped strings are so much better than TH quotations to justify a whole new macro system.

To answer your title line (I’ve put in several proposals, even got one accepted – which has gone nowhere): be a member of the Steering Committee. I’ve observed nobody else has a hope of getting a proposal actually all the way through to release.

Come on, this is a misrepresentation.

Scanning the list of proposals marked ‘implemented’, of the first page (25 PRs), 11 were created by users who are not listed as current or former members of the Steering Committee.

Nah, just be willing to implement your proposals once they’re accepted.

Good point. I was rushing yesterday and made a mistake (I started writing zipPoint, then refactored it to zipFields and forgot to update the type signature). The type of zipFields is String -> [String] -> String and so can be interpreted as stand-alone.

GHC SC is actually pretty lenient and they accept proposals even if there’s not a full implementation.

You’ll have less luck with base proposals, where CLC requires a full implementation with impact analysis upfront.

Yes, it can. This perhaps is beside the point of your example but I took the opportunity to write it using my library, product-profunctors, which is a sort of generics library: addPoint product-profunctors · GitHub

Ah, apologies, I thought I’d seen you saying something about this before and I jumped to conclusions 🙂

As you mention in your draft proposal, unsafePerformIO still exists. FFI is often another form of hidden IO, and is also very difficult to avoid.

I still think it’s worthwhile. The various unsafe functions let us tell people “unpredictable things may happen if you use these”. So for example, we might say “if you use unsafePerformIO inside a pure splice then the working directory may be unpredictable”. Or “pure splices will not be rerun pessimistically, so your IO actions may get stale”. So even if we can’t truly stop people from doing it, we can (I think reasonably) start assuming they do not and offering weaker guarantees if they do.

To be clear, there are plenty of legitimate use cases for IO in TH, so I don’t want to ban impure splices. I just want them to be special and clearly marked.

I could use TH, but not without all of its drawbacks (I already listed some in my post).
TH would not allow me to write zipField and use it in the same file. I would like to be able to do that.
TH splits your files into sections, forcing you to write things in a certain order. I would like to be able to get rid of that.
If we could somehow modify TH to sort out those two things, then I agree with you, we don’t need a new macro system. Maybe “untyped string” is not the best solution and I’m open to anything.

It is when you can use it, but it stops being useful when you are working with an unknown number of elements. Let’s go back to zipField, which takes a list of arguments (not just two). So I could do

data Point  = Point {x,y :: Double}
data Point3 = Point3 {x,y,z :: Double}

addPoint = Point $((zipField ["x", "y"]))
addPoint3 = Point3 $((zipField ["x", "y", "z"]))

How would you write the equivalent in TH using TH quotations? (I’ve done similar things and the code ends up being something like foldl AppE (VarE $ mkName fnName) fs, which, looking at it now, I have no idea what it does.)
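
One possible answer, as a sketch rather than a definitive implementation: the per-field part can stay a quotation, and only the variable-length application needs a fold. The signature below (constructor name, combining function as a Q Exp, list of field selector names) is an assumption made for this illustration, not an existing API.

```haskell
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH

-- Generates: \a b -> Con (f (fld1 a) (fld1 b)) (f (fld2 a) (fld2 b)) ...
zipFields :: Name -> Q Exp -> [Name] -> Q Exp
zipFields con f fields = do
  a <- newName "a"
  b <- newName "b"
  -- One constructor argument per field, written as an ordinary quotation.
  let arg fld = [| $f ($(varE fld) $(varE a)) ($(varE fld) $(varE b)) |]
  -- foldl appE applies the constructor to the arguments one at a time:
  -- ((Con arg1) arg2) ... which is all the foldl AppE idiom does.
  lamE [varP a, varP b] (foldl appE (conE con) (map arg fields))
```

A use site would look like addPoint = $(zipFields 'Point [| (+) |] ['x, 'y]). It has to live in a different module from zipFields, which is exactly the stage restriction complained about above, so this addresses the unknown-number-of-fields part but not the single-file part.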

What if I want to create an HKD (higher-kinded data) type like

data PointF f = PointF { x :: f Double
                       , y :: f Double
                       }
data Point3F f ...

With an “untyped string macro” I could write (in a single file) something along these lines:

$((
makeF :: String -> String -> String
makeF cons fields =
     "data $cons f = $cons  {"
    ++ intercalate "\n,    " [ "$field :: f Double" | field <- words fields ]
    ++  "\n }"
makeF "Point" "x y"
makeF "Point3" "x y z"
))
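
The string-building half of this already runs as ordinary Haskell today. A minimal sketch, assuming the $cons and $field interpolation is replaced by (++) and with the layout of the generated text chosen freely here:

```haskell
import Data.List (intercalate)

-- Render a record declaration whose fields all have type `f Double`.
-- `fields` is a space-separated list of field names.
makeF :: String -> String -> String
makeF cons fields =
  "data " ++ cons ++ " f = " ++ cons ++ "\n  { "
    ++ intercalate "\n  , " [ field ++ " :: f Double" | field <- words fields ]
    ++ "\n  }"
```

Calling putStrLn (makeF "PointF" "x y") prints a declaration for PointF with fields x and y. What is missing is only the step that feeds that text back to the compiler.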

It took me two minutes to write. What would the equivalent be in TH?

makeF cons fields =
    DataD [] (mkName cons) ? ?? ? ( foldl AppT ???) []

Every time something is added to GHC, DataD gets more arguments, which I need to fill in with [] or Nothing etc.

Probably, but I think TH has shown its limits, which is why, since TH was introduced, we got type families and generics (which wouldn’t have been needed if TH had been easy enough to use).
The latest proof of TH’s limitations is the multiline proposal.
We already have one native way to do multiline strings (using \), and there are plenty of nice Template Haskell solutions to it, with or without string interpolation, with or without stripping of leading spaces, etc.
If TH was good enough, everybody would be using a TH solution to this problem, but nobody is, thus the need for a multiline syntax.

I’m not sure about the specifics but IIUC typed TH is better in this regard. See GHC issues #12457 (Deriving should be (more closely) integrated with other metaprogramming methods) and #21051 (Stage restriction differs between untyped and typed TH).