BlockArgument and $

Well this is production code… :laughing:

1 Like

Seriously, the advantage of using $ and do over using parentheses is that reading the code is easier. You do not need an internal stack of open parentheses. "do resets the scope", one could maybe say.
E.g. in

 not <$> do Lens.isBuiltinModuleWithSafePostulates . filePath =<< getCurrentPath

we can immediately see that not is applied to everything following on this line.
In contrast,

not <$> (Lens.isBuiltinModuleWithSafePostulates . filePath =<< getCurrentPath)

requires us to search for the closing parenthesis, after which there could potentially be more stuff.

The same argument justifies the syntax \x -> some expression over the syntax \x(some expression), which would be the lambda-calculus notation before the lambda-calculus “dot” (\x. some expression) was introduced. The arrow (like do) resets the scope: everything following is part of the lambda body. In \x(some ... you would need to search for the closing parenthesis; e.g. \x(some) expression would mean the redex (\x -> some) expression.

I’m all for no parentheses, but mostly out of laziness: I don’t mind opening them, but I don’t like closing them (especially when modifying some code).
Ultimately, I’m all in for a layout-based approach.

The question with do is: at which point does it become readable? Until now, in my (our) mind it starts a monad instead of merely starting a layout block.

That’s just shuffling around a difficulty that wouldn’t be there if you stopped mixing IO and pure code:

current <- getCurrentPath
pure $ Lens.isBuiltinModuleWithSafePostulates (filePath current)

Literally nothing to track and reset.
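Spelled out as a complete, self-contained sketch (the ModuleInfo record and both helpers are stand-ins I made up for illustration, not the real Agda API from the thread's snippet):

```haskell
-- Hypothetical stand-ins for the thread's getCurrentPath and Lens accessor.
newtype ModuleInfo = ModuleInfo { filePath :: FilePath }

getCurrentModule :: IO ModuleInfo
getCurrentModule = pure (ModuleInfo "Agda/Builtin/Nat.agda")

-- The pure check, kept separate from the IO action that fetches its input.
isBuiltinPath :: FilePath -> Bool
isBuiltinPath p = take 5 p == "Agda/"

check :: IO Bool
check = do
    current <- getCurrentModule
    pure (isBuiltinPath (filePath current))

main :: IO ()
main = check >>= print
```

Binding first and then applying a pure function means neither `$` nor parentheses need to span an effectful sub-expression.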

Sure, every parenthesis can be replaced by let bindings too.

…and having names for local definitions makes it much easier to change how they are used later on e.g. if the order of arguments to a function needs to be rearranged.
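As a small sketch (the function and names below are invented for illustration): if the two-argument function ever has its argument order flipped, only the one let binding needs updating, not every use site.

```haskell
-- Hypothetical function whose argument order might later be rearranged.
label :: Int -> String -> String
label n thing = show n ++ " " ++ thing

song :: [String]
song =
    -- One named specialisation; the call sites below don't depend on
    -- label's argument order, only this binding does.
    let bottles n = label n "bottles of beer on the wall"
    in  map bottles [100, 99, 98]

main :: IO ()
main = mapM_ putStrLn song
```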

Are you familiar with Enso? It has this feature. https://enso.org/

Enso does seem to have interesting layout rules, but it would be better to link directly to the relevant documentation:

https://enso.org/docs/developer/enso/syntax/layout.html

3 Likes

And Defining Functions · Enso Developer Documentation!

It has (almost) all the things I have been talking about! Guess I don’t have to write my own language after all, and do something useful instead.

Please no - significant whitespace is a mistake IMO (at least as long as textual communication media and text-based source control systems are a thing). Layout is bad enough as it is; I certainly wouldn’t want more of it, and if we absolutely must have it, then it should at least remain contained within small blocks.

I think whitespace sensitivity is very useful and it seems most people agree. I’ve never seen anybody write their modules like this:

module Main where {
  fib :: Int -> Int;
  fib 0 = 0;
  fib 1 = 1;
  fib n = fib (n - 1) + fib (n - 2);
 
  main :: IO ();
  main = print (fib 10);
}
4 Likes

:100: Sprinkling semicolons and other syntactic noise everywhere is pleasing the lexer. Very annoying when languages optimize for being easy to parse instead of being ergonomic.

1 Like

…yeah, because most implementors enjoy languages with “interesting” syntax:

  • An Implementation of the Haskell Language (1990)

  • Implementing Haskell: Language Implementation as a Tool Building Exercise (1993)

Even simple mathematics can have its challenges:

As for being ergonomic…how about just using natural/spoken-language like humans use?

Not mathematics, but calculators. The problem is that you can’t use fraction notation on calculators, so you have to use the division sign ÷ instead, but that does not have a well defined precedence especially when combined with juxtaposition for multiplication. Again, this is a case where ease of implementation was prioritized over ease of use.
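The classic instance of this (my example, not from the post above) is 6 ÷ 2(1+2): evaluating strictly left to right versus letting juxtaposition bind tighter than ÷ gives different answers, and real calculators genuinely disagree on which convention to use.

```haskell
-- 6 ÷ 2(1+2) under the two rival precedence conventions.
leftToRight, juxtaTight :: Double
leftToRight = (6 / 2) * (1 + 2)  -- ÷ and implicit × share precedence: 9
juxtaTight  = 6 / (2 * (1 + 2)) -- juxtaposition binds tighter than ÷: 1

main :: IO ()
main = print (leftToRight, juxtaTight)
```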

But “ease of use” cannot always take priority over “ease of implementation”, otherwise implementations risk ending up impossibly complicated - the complexity of making things “easy to use” has to go somewhere.

1 Like

You are right; however, “easy to use” also involves humans being able to understand the syntax of the language. So in a way, easy to implement goes hand in hand with easy to understand.

Before you go back to your “natural language” example: natural language is not easy to use. It is full of ambiguities (which is unacceptable for a programming language) and can take years to master. (I’m French; we spend a few hours a week from ages 6 to 15 learning the French “natural language”. Yet most French people are rubbish at spelling, me included; we even have French spelling competitions on TV.)

1 Like

While I agree that implementation hurdles become problematic downstream, we (usually) don’t design languages for the sake of being implemented.

Was it not the selling point of LISP ;-)?

I do understand the argument for whitespace sensitivity - it does indeed reduce visual clutter, I just don’t think it’s worth the problems.

Even if we ignore the challenges of parsing such a language, there are still some rather annoying problems, and those problems are in the realm of ergonomics:

Editor issues

Code is edited far more often than initially written; we rarely get it right on the first attempt, and even when we do, the requirements change more often than not. Hence, it is important to optimize code for editing, and one of the most common editing operations is moving chunks of code around. In a non-whitespace-sensitive language, this is generally unambiguous: you select the code you want to move, cut, you place the cursor in the position where you want to paste it, paste, and then let the formatter take care of aligning it correctly. And because the insert position unambiguously defines where the code belongs in the syntax tree, the formatter can always get it right. But in a whitespace-sensitive language, pasting at a block boundary is ambiguous - it’s often impossible to tell whether you meant to paste at the end of the block above, or the beginning of the block below, so additional manual effort is needed to make sure the editor automation does the right thing. It’s not the end of the world, but it’s one more ergonomic papercut.

Another papercut is that automating “jump to beginning / end of block” is difficult. Most editors have built-in features for jumping to matching braces, parentheses, brackets, etc., some can even select, cut, delete, etc., such blocks with just one or two commands (e.g. vim’s a{ / i{ movements), but with whitespace-defined blocks, this can’t easily be done in a language-agnostic fashion. Similar issues exist with folding.

Oh, and of course these two papercuts stack up: I have to manually re-indent my code a lot, and when I want to change the indentation of a block, selecting that block also requires manual effort. I can’t just tell the editor to “increase indentation for this block and all nested blocks inside it by 1 level”; I have to manually find the end of the block and then say “increase indentation for the current selection by 1 level”.

Readability issues

Indentation is a powerful tool for signalling intended structure; but in a whitespace-sensitive language, we can only use it within the limitations of the syntactic relevance of that whitespace, and we have to be mindful of those rules.

I’ll start with an example from JavaScript, which is almost whitespace-insensitive, but has this gnarly thing called “semicolon insertion”, which makes newlines syntactically significant, but only sometimes. Quick, where’s the bug?

return
    getRavenousness(bugblatterBeastOfTraal) > 1;

(The bug: automatic semicolon insertion puts a `;` right after `return`, so the function returns `undefined` and the expression on the next line is dead code. And, side note, this also feeds into the “communication issues” discussed below.)

A classic example from Haskell would be:

main = do
    let greet name = do
        putStrLn $ "Hello, " ++ name ++ "!"

These things turn reading code into a puzzle game, and while I enjoy a good puzzle, I don’t think this is the right place for them.
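For the record, the problem above is that putStrLn starts in the same column as greet, so layout closes greet’s (now empty) do block and tries to read the putStrLn line as a second let binding - a parse error either way. A layout-correct version (with a made-up call in main for illustration) indents the body one step further:

```haskell
greeting :: String -> String
greeting name = "Hello, " ++ name ++ "!"

main :: IO ()
main = do
    -- The body sits one step past `greet`, so layout keeps it inside
    -- the binding instead of starting a new let binding.
    let greet name = do
            putStrLn (greeting name)
    greet "world"
```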

Diff issues

As it stands, source control is still, and likely will be for the foreseeable future, text-based (or at least, I don’t know of any SCM that does syntax-aware tree diffs or tracks actual editing operations), and largely operates at source line granularity. Without explicit block delimiters, some operations that are deemed “safe” are, in fact, not - e.g., consider this code:

main = do
    -- Greet
    putStrLn "Hello, world!"
    -- Make smalltalk
    putStrLn "How are things?"
    -- And we're done.

Now imagine two developers independently working on this code; developer A removes the greeting (the comment and its putStrLn), developer B removes the smalltalk. Both end up with a valid Haskell program

main = do
    -- Make smalltalk
    putStrLn "How are things?"
    -- And we're done.

and

main = do
    -- Greet
    putStrLn "Hello, world!"
    -- And we're done.

And now we merge those edits; they are compatible, the SCM can figure out how to auto-merge them, or at least it thinks it can, but the result is malformed Haskell:

main = do
    -- And we're done.

Communication issues

The above-mentioned editor issues get worse when the “editor” is not the code editor we carefully tuned to our coding needs, but a textarea in a website or IM - those things are never optimized for code, let alone code written in a specific language, but we still need to use them to edit code. And they usually apply their own text mangling, e.g. when quoting messages, or even just reformatting things to fit a different presentation. Just look at the Python subreddit - it is possible to put whitespace-sensitive code into Reddit, Twitter, etc., in such a way that it doesn’t come out mangled, but it requires more diligence than most people are willing to put into it, and even those who know how to do it in principle will get it wrong when writing casual posts. E-mail also causes similar problems; while you can control the editor, you cannot control the mangling that happens on the other end, and if a mailing list or similar is involved, that adds more mangling en route.

In a whitespace-insensitive language, this isn’t a huge problem - all the block delimiter tokens are still there, so you can use automated tools to quickly de-mangle the code in an unambiguous manner. int main(int argc, const char *argv[]){for(int i=0;i<100;i++) printf("%i bottles of beer on the wall\n", i); return 0;} looks like word barf, but demangling it is completely unambiguous. Compare that to the equivalent Haskell: main = do let go n | n < 0 = return () go n = putStrLn $ show n ++ " bottles of beer on the wall" n0 = 100 go n0 - you can stare it down and, as a human, figure out where the line breaks need to go in order for the program to make sense, but it’s too ambiguous for a computer to reliably do it for you. You could write a program that finds candidate spots for newlines and indentation, and then just tries all candidate solutions in order until it finds one that compiles, but, ugh.


Again, these are papercuts, and I’ll happily live with them in Haskell, because the language makes up for it quite massively - but I do consider them one of the less nice aspects of Haskell.

Oh, and about the “but layout is optional” argument: yes, it is, technically, but writing Haskell with explicit braces and semicolons is uncommon enough to be considered unidiomatic, so if you care at all about cooperating with other Haskell developers, it’s basically a no-go.

So that’s where I’m coming from - it’s part of the language, and I’ll deal with it, but please please please don’t add more of it.

5 Likes

Fair enough. I think it would be possible to write a formatter that can recover this type of error, but that is indeed not trivial.

GitHub - jeetsukumaran/vim-indentwise: A Vim plugin for indent-level based motion.

It isn’t guaranteed to get you exactly to the start of each block, but if you are disciplined in using a compatible code style then you can get pretty close.

Or just run your formatter.

You can also make your code quite unreadable by placing braces and semicolons in unexpected places.

I just don’t see how braces and semicolons would fix this. Wouldn’t you get:

main = do {
    -- Greet
    putStrLn "Hello, world!";
    -- Make smalltalk
    putStrLn "How are things?";
    -- And we're done.
}
main = do {
    -- Make smalltalk
    putStrLn "How are things?";
    -- And we're done.
}
main = do {
    -- Greet
    putStrLn "Hello, world!";
    -- And we're done.
}
main = do {
    -- And we're done.
}

The bugs in Reddit’s markdown are one of the reasons I’ve quit. I hope they make more anti-user changes soon so that it really burns down completely and we can finally move on to more respectful platforms (e.g. kbin).

As far as I’m aware my plain text mails are usually rendered correctly.

I actually think this would not be that difficult with some heuristics and a generalized parser which supports ambiguities. But I agree this requires some more research/engineering.


All in all, I still think the readability benefits are worth the small issues, especially because braces and semicolons also have their own issues.

2 Likes