Thank you for the nice blog post!
I like most of the suggestions, it feels like a step towards elm, but without taking away the flexibility of Haskell.
In my opinion, elm’s syntax is an improvement over Haskell’s, but it prioritised beginner friendliness a little bit too much.
Also, cleaning up records and module imports is a huge improvement, hopefully we can improve these two things in Haskell as well.
The only small thing I would like to mention is about the controversial things (as you expected):
Tree a =
| Nil
| Node { value : a, left : Tree a, right : Tree a }
I think this example contradicts the point:
- We can no longer mix records with sum types and get partial accessors or updaters.
Additionally, I feel like type and data are very useful to declare the intent of the data type. For example, elm has type and type alias to distinguish sum and product types.
This improves error messages, as the compiler can tell you that your intent and your actual implementation are in disagreement, whereas with the tree example it is less clear why it is wrong. The example implicitly mixes sum and product types, so a potential error message should explain why it is wrong and suggest ways to fix it. However, in this case, should the error message suggest you remove the accessors of Node, or should Nil be removed, etc.?
I like lists like these that highlight parts of Haskell that can be improved.
However, I think we should really just start identifying what the problems are. For example, I don’t understand point 9 (the one between 7 and 8!?) about docstrings at all. We have Haddock, what’s wrong with it?
I also think we should consider why things are as they are now. For example, point 8 about debugging suggests we should have each type derive a Debug class which (presumably) would allow us to call a special pretty-printing function on any value, including those that have a polymorphic type. But that requires keeping around a ton of extra information at run time, which will slow down programs. Is that really a tradeoff we’re willing to make?
Sorry, I’m not sure I understand. What I have an issue with is being able to define the same data type in Haskell and then being able to write let x = Nil in x { value = 1 } and value Nil. With first class records that wouldn’t happen.
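To spell it out as a self-contained sketch (using the Haskell spelling of the Tree type from the example), both expressions compile and only fail at run time:

data Tree a
  = Nil
  | Node { value :: a, left :: Tree a, right :: Tree a }

broken1 :: Tree Int
broken1 = let x = Nil in x { value = 1 }   -- partial record update: fails at run time

broken2 :: Int
broken2 = value Nil                        -- partial selector: fails at run time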
For example, elm has type and type alias to distinguish sum and product types.
type alias in elm is the same as type in Haskell, and type in elm is the same as data in Haskell. I don’t think they distinguish sum and product types?
Regarding Debug, we do similar things for Show already, and this doesn’t have to happen for release builds. I’m sure this is a solvable problem.
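To make the comparison concrete, here is a tiny sketch of what Show plus Debug.Trace already let us do (sumSquares is just an illustrative function; a hypothetical Debug class would presumably drop the explicit constraint):

import Debug.Trace (traceShow)

-- Any value whose type has a Show instance can be dumped mid-computation.
sumSquares :: (Num a, Show a) => [a] -> a
sumSquares xs = traceShow xs (sum (map (^ 2) xs))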
Regarding docstrings, there’s a long thread about it in Haddock’s issue tracker already: Support Markdown syntax via `commonmark-hs` · Issue #794 · haskell/haddock · GitHub
Sorry about the numbers mixup, too much editing. Thanks!
With first class records that wouldn’t happen.
I don’t quite understand that: given the Tree example above, and a :: Tree, why would value a not work? Would it only work if you previously made sure that a is indeed a Node?
I don’t think they distinguish sum and product types?
Indeed, I mixed up some terms. Disregard then.
Since you seem to have a preference for its syntax, it would probably be easier to just adapt an existing Standard ML front end for that purpose - the two main changes I can think of right now being the switch to non-strict semantics by default, and limiting references to imperative contexts.
The old Chalmers Lazy ML implementation is also still available.
remove […] partial functions like head […]
…and (/), div, mod, divMod, quot, rem, quotRem, et al!
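For illustration, total counterparts are easy to write (safeHead and safeDiv are made-up names here; listToMaybe is from Data.Maybe):

import Data.Maybe (listToMaybe)

-- Return Nothing instead of crashing on the awkward inputs.
safeHead :: [a] -> Maybe a
safeHead = listToMaybe

safeDiv :: Integral a => a -> a -> Maybe a
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)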
Provide a better API around I/O than lazy I/O.
…meanwhile, in Prolog:
In Haskell, when you define
data Tree a
= Nil
| Node { value :: a, left :: Tree a, right :: Tree a }
Haskell generates new functions for you, like value :: Tree a -> a.
This is not the case in a language like PureScript, where records are first class and .value has the type forall r a. { value :: a | r } -> a and works directly on the payload type of Node and not on Tree a. So you’d have to pattern match on something of type Tree a, and in the Node payload branch you can do payload.value.
tl;dr I think the records problem is basically solved, and we just need to encourage turning on the right extensions.
Not with NoFieldSelectors, which I hope will become part of GHC202*.
This is really not that different to what I’d do in Haskell:
f = \case
Nil -> _
Node{value} -> _
GHC actually warns on this, albeit in a slightly unintuitive way.
My only issues with records in GHC now are really that dot access is partial, and update isn’t overloaded, and these are both being worked on.
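For concreteness, a minimal sketch of the extensions in question (OverloadedRecordDot plus NoFieldSelectors, available from GHC 9.2; Person is just an illustrative type):

{-# LANGUAGE OverloadedRecordDot #-}
{-# LANGUAGE NoFieldSelectors #-}

data Person = Person { name :: String, age :: Int }

-- p.name resolves through HasField; with NoFieldSelectors no top-level
-- selector functions (name, age) are generated.
greet :: Person -> String
greet p = "Hello, " <> p.name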
I don’t think |> and <| are good ideas for Haskell, because you cut yourself off from a lot of historical code, as well as from a consistent visual language which has grown up around those operators.
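For reference, the pipelines those operators provide are already expressible with the existing (&) from Data.Function and plain ($):

import Data.Function ((&))

-- (&) plays the role of |>; ($) plays the role of <|.
doubledEvens :: Int
doubledEvens = [1 .. 10] & filter even & map (* 2) & sum

doubledEvens' :: Int
doubledEvens' = sum $ map (* 2) $ filter even [1 .. 10]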
OP:
Records are anonymous and are generated as data types with a canonical representation.
and
I don’t think the “records problem” is totally solved in Haskell, and I also don’t quite understand how the blog post suggests adding anonymous records to Haskell via a new frontend. I would love for Haskell to have ergonomic anonymous records. To avoid any confusion, this is the sort of code I want to be possible in Haskell:
moreEvenThanOdd :: Array Int -> Boolean
moreEvenThanOdd xs = counts.evens >= counts.odds
where
counts = foldr updateCounts {odds: 0, evens: 0} xs
updateCounts i acc = if isEven i
then acc {evens = acc.evens + 1}
else acc {odds = acc.odds + 1}
(This is working Purescript code.)
Note that I don’t need to declare the record type anywhere, this code just works standalone. This is what I understand by anonymous (unnamed, undeclared) records.
…or perhaps:
moreEvenThanOdd :: Array Int -> Boolean
moreEvenThanOdd xs = counts.evens >= counts.odds
where
counts = foldr updateCounts {odds = 0, evens = 0} xs
updateCounts i acc = if isEven i
then acc {evens = acc.evens + 1}
else acc {odds = acc.odds + 1}
But this is getting somewhat off-topic: it’s about an alternative (replacement?) front end, not augmenting the existing one (presumably in GHC), e.g. one example being:
GitHub - rusimody/gofer (featuring “Dijkstra dot” notation)
…but for Haskell instead of Gofer.
But if you do want to add to a list of things Haskell should (or shouldn’t) have:
The idea in my head is that your code will be translated to something like this:
data RecordEvensOdds
= RecordEvensOdds
{ _recordEvensOddsEvens :: Int
, _recordEvensOddsOdds :: Int
}
makeFields ''RecordEvensOdds
moreEvenThanOdd :: A.Array Int -> Bool
moreEvenThanOdd xs = view evens counts >= view odds counts
where
counts = A.foldr updateCounts RecordEvensOdds{_recordEvensOddsEvens = 0, _recordEvensOddsOdds = 0} xs
updateCounts i acc = if isEven i
then over evens (+1) acc
else over odds (+1) acc
But tbh this is a fairly raw idea, I haven’t thought too much about the possible kinks of this approach.
Yes, for this example that can work. But I think for general code, working out which anonymous record types have to be secretly declared would require quite a bit of type inference, which would probably then have to be aware of nearly all of GHC’s type inference, at which point it starts to sound hopeless. But I might be wrong.
You’re right, there is some type inference involved here to figure out which type a record should have during initialization.
It might be possible to convert record construction to constructing tuples first, extract the types, and then construct the data type. Or there might be some GHC API that lets you fetch the type of a partial expression. Or we can make the types polymorphic instead of concrete (see the sketch below).
There are probably a few angles to try before it becomes hopeless.
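As a sketch of the last angle (polymorphic rather than concrete types), GHC’s existing HasField machinery already lets a function demand only the fields it uses; moreEvensThanOdds is just an illustrative name:

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeApplications #-}

import GHC.Records (HasField (..))

-- Works for any record type r that has Int fields "evens" and "odds".
moreEvensThanOdds :: (HasField "evens" r Int, HasField "odds" r Int) => r -> Bool
moreEvensThanOdds counts = getField @"evens" counts >= getField @"odds" counts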
Partial field selectors are indeed a problem; however, they are really handy when updating and/or in conjunction with RecordWildCards.
For example you might want to write (or equivalent)
incTree Nil = Nil
incTree Node{..} = Node{value = value + 1, ..}
Which you cannot do without field selectors.
What I propose is either:
- don’t generate the field selector function for fields starting with _ (but keep the update syntax). It is in a way the same as NoFieldSelectors but only for the chosen fields.
- Generate a Maybe version of the field selector function if it is partial, but keep the update fields (sketched below). In our case we would have value :: Tree a -> Maybe a.
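The second option is easy to sketch by hand today (using NoFieldSelectors so the name value is free for the total version):

{-# LANGUAGE NoFieldSelectors #-}

data Tree a
  = Nil
  | Node { value :: a, left :: Tree a, right :: Tree a }

-- What the proposal would have the compiler generate automatically.
value :: Tree a -> Maybe a
value (Node v _ _) = Just v
value Nil          = Nothing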
The other thing I find missing in Haskell is indeed a real macro system (without stage restriction and which doesn’t break every time GHC internals change) and a way to annotate things in declarations (similar to the annotations in Persistent’s entity syntax). Those annotations could then be used in TH or Generics.
This could be used, for example, to specify a JSON name different from the field name.
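For comparison, here is roughly how that specific use case is handled today with aeson’s Generic options, i.e. configuration passed to the deriving code rather than an annotation on the declaration (User is just an illustrative type):

{-# LANGUAGE DeriveGeneric #-}

import Data.Aeson (ToJSON (..), genericToJSON)
import Data.Aeson.Types (Options (..), defaultOptions)
import Data.Char (toLower)
import GHC.Generics (Generic)

data User = User { userName :: String, userEmail :: String }
  deriving Generic

-- Drop the "user" prefix and lowercase the first letter,
-- so userName is serialised under the JSON key "name".
instance ToJSON User where
  toJSON = genericToJSON defaultOptions
    { fieldLabelModifier = \field -> case drop 4 field of
        c : cs -> toLower c : cs
        []     -> []
    }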
Although both Herb Sutter’s talk and the blog post suggest an “alternative syntax” (both attempts of which I can sympathise with), it’s not just a new syntax. It’s also name resolution on that syntax (otherwise how would Cpp2 work without forward declarations? Or how would the changes to Haskell records work?), e.g., static semantics.
My hypothesis is that any such “alternative syntax” is doomed to fail as soon as “enhancements” to the (complicated, in both cases) type system come into play. You might think that it’s “enough” to be (think feature-) complete wrt. the target type system, e.g., only reject programs that you are sure should be rejected (Sutter’s bounds checks, inout, etc.).
That’s good enough for a first prototype, but ultimately people expect good type errors on their original program and hence your model of the type system must be sound as well, e.g., you have to do type checking on the alternative syntax. You absolutely don’t want to replicate the type system of C++ or GHC Haskell!
That’s why I’m a bit hesitant about ideas of first-class records and row polymorphism (both type system extensions), although these paths have been trodden before. (The PersonRecord desugaring is insufficient for anonymous records, for that matter.) The problem is the interaction with the rest of the type system.
I guess this is quite like the challenge faced by macro systems, which is why most of them don’t support major extensions to the type system. And for macro systems, mapping back error conditions to particular syntax constructs is a major challenge as well.
By the way, in my opinion Haskell is not a research language anymore (in the sense that the language itself is the subject of research). Adding Dependent Types to Haskell is not research, it’s engineering. The research was Richard Eisenberg’s thesis, the rest is just application of that and engineering it into a coherent language. Sure, it’s a huge project and worth a prominent academic publication to summarise the lessons learned (so that others might learn them faster), but it won’t steer current trends in programming language research (and by that I mean the restricted notion of research on the frontend of a language, which for many Haskellers is the only relevant notion of research). The latter happens on smaller languages with smaller compilers and less production constraints.
So the whole matter of “which language extensions should be standard” is not one of academics vs. industry users; it’s one of enthusiast users vs. industry users, both of which are the result of personal preferences justifiable by different metrics.
Fair points. I assume this project is doomed to fail for sociological reasons before we even get to the technological parts, but I do agree that this kind of project shouldn’t try to bite off more than it can chew. Still, I think it’s worth trying to figure out records before bailing out, and I’m not sure you need to extend the type system for row polymorphism, since you can do that with lenses.
Regarding extensions and enthusiasts vs industry users, yes, different groups would prefer different features. What’s in the blog post is my opinion on what would make the best bang-for-buck for a language without customization knobs, but I understand that some people see things differently.
In a language with first class records like PureScript, field selectors like the ones in Haskell are irrelevant because they don’t work on the data type Tree, they work on the payload of Node. So your snippet becomes:
incTree = case _ of
Nil -> Nil
Node r ->
Node (r { value = r.value + 1 })
Btw you can still do the thing you wanted to do in Haskell with similar convenience and without field selectors by using an as-pattern:
incTree = \case
Nil -> Nil
n@Node{value} -> n { value = value + 1 }
Ok, I misunderstood your initial proposal. Basically, {value :: a} is a type on its own, and Node {value :: a} is not a record but a Node with a {} attribute. We could have (if we wanted) Node {value :: a} {x :: Int, y :: Int}.
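In today’s Haskell, the closest spelling of that idea is to give Node named payload record types (the names NodePayload and Extra are made up for illustration), which is essentially what the PureScript version above does:

-- Selectors on the payload records are total; only Tree itself is a sum.
data NodePayload a = NodePayload { value :: a, left :: Tree a, right :: Tree a }

data Extra = Extra { x :: Int, y :: Int }

data Tree a
  = Nil
  | Node (NodePayload a) Extra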