Strongly agreed, as those who saw my answer to Jose Valim’s challenge will probably guess.
…along with all the other “linglyphics” associated with effect-centric code in Haskell, e.g. from Applicative, et al.
Agreed: one person’s moment of cuteness or cleverness can often look like “write-only code” to others - we cannot assume that everyone who uses Haskell these days has the years or decades of experience that those of us who were there “in the early days” have now accumulated.
So as ugly as Haskell’s “two-tone” syntax (for regular and imperative Haskell) is, for now using do-notation is the least ugly option (it lessens the need for those linglyphics).
Another cost of mixed styles is more frequent “switching” between:
- “expression-style” contexts (which use (>>=), (>>) and the rest),
- and “statement-style” ones (using do and <-);
…in order to understand the resulting code - not only is this an acquired skill for most, but the need to remember which context is being used is still an expense to be avoided, no matter what the magnitude of Haskell experience.
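To make the two contexts concrete, here is a minimal sketch of one computation written both ways. The function names and the choice of Maybe instead of IO are assumptions made for illustration, so that the results are easy to check. Since do-notation is sugar for the operator form, both definitions should behave identically.

```haskell
import Text.Read (readMaybe)

-- "Statement-style": do-notation with <- binds.
addStatement :: String -> String -> Maybe Int
addStatement s t = do
  a <- readMaybe s
  b <- readMaybe t
  pure (a + b)

-- "Expression-style": the same computation with explicit (>>=).
addExpression :: String -> String -> Maybe Int
addExpression s t =
  readMaybe s >>= \a ->
  readMaybe t >>= \b ->
  pure (a + b)
```

Loaded into GHCi, both functions give Just 7 for the inputs "3" and "4", and Nothing when either input fails to parse.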
But this should not be thought of as an endorsement by me of do-notation: if I could go back to 1996, we would all be writing something more like:
program :: IO ()
program = do () | let a = 3,
                  let b = 4,
                  print (a + b)
i.e. do-blocks would have the syntax of list comprehensions, but without the brackets (those being reserved only for actual list comprehensions).
To specify, I think this is an important problem because for a lot of applications (but not all, or even most), Haskell naturally ends up with a lot of monadic code and relatively little pure code (not imperative shell, functional core, but imperative bones, functional flesh).
When it comes to #1, the problem comes down to simply not being good enough.
Yes, Haskell has a notion of effect, but the way I describe do-notation is as “somewhat more verbose Python”, and even though Python is widely regarded as the most readable of all imperative languages, I’d expect that Haskell can do better.
That is why I’m interested in #2, #3, and #4; i.e, if a lot of Haskell code is going to be monadic, it’d better be a very pleasant monadic experience, otherwise the only thing you have to offer is having an explicit notion of effects.
As far as going for #2, #3, and #4, over #1, yes, I’m aware both of Do Notation Considered Harmful, as well as, I believe, Chris Allen mentioning that many Haskell learners go through a “**** do notation” phase.
And I agree with the last statement; i.e, avoiding do notation entirely is a phase people go through on their Haskell journey (I’m at least a year past this), but being purist on “imitation Cobol/Python”, is something I have to question too.
Sorry if I missed it, but I don’t think you explained this. You said 1 is “a waste” and “simply not good enough” but I don’t understand why you think that.
When it comes to #1, the problem comes down to simply not being good enough.
It absolutely is good enough. Results are always produced on the left and every line describes a single step. It can be split or combined to the level of granularity wanted and it’s pleasantly uniform.
Using anything else to spice up the control flow immediately kills this simplicity and the resulting code is guaranteed to be harder to both read and alter.
the way I describe do-notation is as “somewhat more verbose Python”
Are you sure this is on the do-notation and not on the fact that virtually everything you do in Haskell is more verbose? You’ll need a mountain of sugar to compete in shortness with that.
Frankly, “do notation considered harmful” is at best misleading and at worst completely wrong. Again, there is nothing more imperative about do notation than about any other use of Monads.
If you don’t want to use monads, that’s fine I guess but don’t fool yourself into thinking that writing out obtuse operators for simple monadic code makes you any more “pure” than others.
Well, yes, that is exactly what monads are. (>>=) is literally the monadic equivalent of a let binding.
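One way to see the analogy is with the Identity monad from base, where a bind carries no effect at all. This is only a sketch; the names withLet and withBind are made up for the comparison.

```haskell
import Data.Functor.Identity (Identity (..))

-- An ordinary let binding.
withLet :: Int
withLet = let x = 3 in x + 4

-- The same shape with (>>=) in the Identity monad:
-- bind a name to a value, then continue with the rest of the computation.
withBind :: Int
withBind = runIdentity (Identity 3 >>= \x -> pure (x + 4))
```

Both definitions evaluate to 7. In other monads the bind additionally threads whatever effect the monad models, which is where the two stop being interchangeable.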
#1 is a strawman, and I didn’t think this was going to be the approach everyone preferred. If I did, I’d have spent more time writing it.
The reason I think #1 is a waste, is because of the superfluous bindings, when it could have been written simply as print (3 + 4) or print 7.
Of course, making intermediate steps explicit is always useful when you’re dealing with a long function chain, but in the example given, it’s simply not useful.
As far as being “not good enough”, Haskell has a reputation for extremely elegant code. I see this quite often in the pure space, where the algorithm is extremely explicit and boilerplate-free, thanks to higher-order functions, recursion, and function composition.
When you get to do-notation, on the other hand, it’s hard to make the same apply.
At others:
The point is more, I’m trying to explore what is a strong way to write monadic code in such a way that readability and concision, if not necessarily accessibility, are better than Python.
I think Haskell can actually do this, but not via an apples-to-apples comparison.
The trick comes because in Haskell, it is, at least in some corners of the community, idiomatic to use >>=, <$>, liftA2 / <*>, before a do-notation bind (bar ← foo).
This is equivalent to using function pipelines or method chaining in more traditional languages, and that’s either controversial or smelly in traditional / imperative programming.
The whole monadic/applicative/functorial smorgasbord makes explicit what you’re doing, and even if it’s less concise or ends with weird operators, the explicitness cuts away the smell.
Here are two programs, translated back and forth between Python and Haskell, which do the same thing. Admittedly, the exception handling is horrible, but these are just samples showing what’s possible.
def main():
    file1_file_path = input()
    file2_file_path = input()
    #
    file1 = open(file1_file_path).readlines()
    file2 = open(file2_file_path).readlines()
    #
    if file1 == file2:
        print('Success!')
    else:
        print('Failure!')
import System.IO (readFile')

main :: IO ()
main = do
    file1FilePath <- getLine
    file2FilePath <- getLine
    file1 <- readFile' file1FilePath
    file2 <- readFile' file2FilePath
    if file1 == file2
        then putStrLn "Success!"
        else putStrLn "Failure!"
Okay, so that’s version one. Let’s see the same semantics, but ordered a bit differently:
import Control.Applicative (liftA2)
import Data.Bool (bool)
import System.IO (readFile')

main :: IO ()
main = liftA2 (==) (getLine >>= readFile') (getLine >>= readFile')
    >>= putStrLn . bool "Failure!" "Success!"
def main():
    print('Success!' if open(input()).readlines() == open(input()).readlines() else 'Failure!')
Okay, so in both versions, the Python version wins out in terms of concision. But when you’re looking at the 2nd Haskell version, it’s more concise than the 1st Python version, I’d say it’s more readable, and it’s idiomatic, at least among some Haskellers.
The 2nd Python version, on the other hand, I’d think is generally smelly because it doesn’t clearly indicate an order of effect, and will get you screamed at by the Python style purity brigade.
Wait for that angry code review!
Generally speaking, I think with Haskell you can make good use of horizontal space, but let us not forget code is written once and read many times. I prefer version #1: what is happening is immediately clear, you can sprinkle traceM in later if needed, etc.
The larger the function gets, the more I prefer unambiguous, simple code. If it gets too long, you can always move stuff to a where clause; adding signatures helps too!
Unfortunately, that’s usually the case with most larger forms of syntactic sugar: they don’t provide a way to use certain “granules” as that would lead to syntax errors. An example is the use of <-, which some have thought was a Haskell operator that can somehow extract the result of any monadic expression (it isn’t and it doesn’t).
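A small sketch of why <- is not a general “extract the result” operator: it is part of the do syntax and desugars to (>>=), so what the bound name stands for depends on the monad. The definitions below are illustrative only.

```haskell
-- In the list monad, x ranges over every element in turn;
-- there is no single result to extract.
doubled :: [Int]
doubled = do
  x <- [1, 2, 3]
  pure (x * 2)

-- The desugared form of the block above.
doubledDesugared :: [Int]
doubledDesugared = [1, 2, 3] >>= \x -> pure (x * 2)

-- In Maybe, the code after <- never runs when there is no value.
noValue :: Maybe Int
noValue = do
  x <- Nothing
  pure (x * 2)
```

Here doubled and doubledDesugared are both [2, 4, 6], while noValue is Nothing: the continuation that mentions x is simply never called.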
Considering that:
I think we can assume that the inspiration for Python’s procedural syntax was in large part the functional syntax in Miranda(R) and Haskell - if I understand you correctly, you are now attempting to find a way to make Haskell’s procedural (do-notation) syntax as neat as Python’s. If so, then by way of “syntactic transitivity”, it seems you’re wanting a more functional syntax for monadic expressions, beyond either do-notation or regular Haskell expression/function syntax involving the various monadic operators - you’re taking on quite a challenge there!
Before you proceed any further:
To add:
In summary:
- at the time, do-notation was considered to be the most-palatable form of syntactic sugar.
- while it started out as the work of a few individuals, it entered standard Haskell by way of group approval.
Well that is an interesting example because I absolutely disagree that the second Haskell version is more readable.
It just unnecessarily obscures the order of effects and I don’t think I’m in the minority with this opinion. In fact, you got it wrong yourself! ^^
These two versions aren’t actually quite equivalent.
The order of effects in the first one is (obviously because you can clearly read it off the order of statements) getLine -> getLine -> readFile' -> readFile'.
But your second version actually performs effects in the order getLine -> readFile' -> getLine -> readFile', which manifests itself if the first file is not found.
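The difference in ordering can be made visible without touching the file system. The sketch below is an assumption-laden stand-in: it replaces IO with the pair monad ((,) [String]) from base, which appends a log entry per effect, and getLineT / readFileT are hypothetical mock actions rather than the real getLine and readFile'.

```haskell
import Control.Applicative (liftA2)

-- A writer-like monad from base: the first component accumulates a log.
type Trace = (,) [String]

-- Mock of getLine: logs its name and returns a dummy path.
getLineT :: Trace FilePath
getLineT = (["getLine"], "some/path")

-- Mock of readFile': logs its name and returns dummy contents.
readFileT :: FilePath -> Trace String
readFileT _ = (["readFile'"], "contents")

-- Shape of the first version: both prompts, then both reads.
sequentialOrder :: [String]
sequentialOrder = fst $ do
  p1 <- getLineT
  p2 <- getLineT
  f1 <- readFileT p1
  f2 <- readFileT p2
  pure (f1 == f2)

-- Shape of the second version: each prompt is followed by its read.
liftedOrder :: [String]
liftedOrder =
  fst (liftA2 (==) (getLineT >>= readFileT) (getLineT >>= readFileT))
```

With these mocks, sequentialOrder is ["getLine", "getLine", "readFile'", "readFile'"] and liftedOrder is ["getLine", "readFile'", "getLine", "readFile'"], matching the two orders described above.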
I also find the claim that this is “idiomatic” somewhat objectionable. Just as one data point from someone who definitely knows what she’s doing, the pipes tutorial (which is amazing btw) only uses the first style.
From this example:
…it seems you prefer the “point-free” style - at the time, so did John Backus:
Can Programming Be Liberated from the von Neumann Style? … (1978)
Here’s another point I’ve gotten wrong. The Haskell version, if you ignore the usual “10 lines of extensions, 25 lines of imports” stuff Haskell code often incorporates (and here to a very tiny degree), is actually very slightly smaller than the Python version.
To clarify, I prefer restricted point-free, i.e, I think point-free can easily become unreadable when the chain gets too long, but when used in moderation it’s beneficial.
If, say, the problem is “get two file names on the input prompt, try to compare the files represented, and return success if they’re the same and failure if not”, my actual preference would be:
import System.IO (readFile')

main :: IO ()
main = do
    file1 <- getLine >>= readFile'
    file2 <- getLine >>= readFile'
    putStrLn
        $ if file1 == file2
            then "Success"
            else "Failure"
This is a blend of the readability of #1 and the concision of #2.
As a limit of where I think point-free starts falling apart:
apiGet :: IO ()
apiGet = do
    res <- getChar
        >>= parseRequest . \case
            '1' -> urls !! 0
            ___ -> urls !! 1
        >>= httpLBS
    traverse_ putStrLn $ responseBodyBlock res
  where
    urls =
        [ "https://jsonplaceholder.typicode.com/todos/1"
        , "https://api.example.com/data"
        ]
    responseBodyBlock resource =
        [ "Response body:"
        , unpack $ getResponseBody resource
        , "OK1"
        ]
Just a snippet, a refactor of someone else’s code. Someone suggested that I take out do entirely, and turn it into a purely monadic pipeline.
But for me, it’s obvious that, even if I’m point-freeing to avoid some unnecessary bindings, the “get data” and “print data” are two separate steps, and just directly pipelining the traverse would make it significantly harder for me to read.
If this were to be turned into a pure pipeline, it’d follow the #4 model, where the steps are clearly delineated on the top-level, then have their implementation specified in the where clause.
And I’m already uncomfortable with the pipeline; it is at the maximum level I can tolerate. Anything more, I’d stuff a name on it and push it down to the where clause.
To summarize, then, I like, but do not push for, a semi-point-free style often mediated by do-notation.
I think I now understand your perspective on this matter: Haskell is labelled as being a functional language, but more often than not you find yourself looking at vast expanses of rather-procedural code.
The first nonstrict functional language I learned about was Miranda(R). So the transition to Haskell, and the already-widespread use of do-notation, was a slow one (it being more the result of necessity rather than preference). But I remained dissatisfied with Haskell’s imperative style for working with effects - over the last twenty years or so, I’ve learned to tolerate it, but not accept it (if many of the quotes, articles and references I’ve placed here and elsewhere didn’t already make that obvious ;-).
This remark:
…seems “about right” on this matter: these days, it seems most new Haskellers are far more interested in just “getting up and going”, which that simple “ultra-imperative” style allows them to do more quickly, particularly if they have prior exposure to imperativity. So for now, we need to write code that they can easily understand, even if we aren’t entirely satisfied with it.
Yeah, but the point of this thread was to ask a question about how people style their monadic code, i.e, what kinds of best practices people had arrived at.
I’m just really disappointed that the most common answer was #1, that is to say, people seem to be seeking an imperativeness which is greater than that of Javascript, Python, F#, Elixir, and OCaml.
In the two mainstream languages, mild use of method chaining has become mainstream, and method chaining is somewhat similar to direct monadic bind in terms of style.
In the impure functional languages, pipeline operators are both used and abused, and there’s effectively experience around when pipelining is bad, as opposed to just trying to ban direct monadic bind altogether.
Imperative Haskell code isn’t the same as imperative JS, Rust, Go, etc. The fact that it’s first class and has Monad etc instances means you can metaprogram Haskell imperative code very nicely. That’s where it shines imo.
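As a rough illustration of actions being first-class, the sketch below stores actions in a list and passes them to ordinary library functions. It assumes the pair monad ((,) [String]) from base as a stand-in for IO so the effects show up as a log; step and the other names are hypothetical.

```haskell
import Control.Monad (replicateM_, when)

-- Stand-in for IO: the first component of the pair accumulates a log.
type Logged = (,) [String]

-- A single "imperative statement" that records its own name.
step :: String -> Logged ()
step name = ([name], ())

-- Actions are ordinary values: build a list of them, then run them in order.
runAll :: [String]
runAll = fst (sequence_ (map step ["connect", "query", "close"]))

-- Control structures are plain functions that take actions as arguments.
pings :: [String]
pings = fst (replicateM_ 2 (step "ping"))

-- Conditional execution is also just a function (when), not special syntax.
conditional :: Bool -> [String]
conditional verbose =
  fst (step "start" >> when verbose (step "debug") >> step "end")
```

The same combinators work unchanged on real IO actions, since they only rely on the Applicative and Monad instances.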
That’s where it shines […]
…like the proverbial “gilded cage” where I/O is concerned:
I also think this blended or hybrid approach is preferable. To me, semi-point-free code absolutely can improve readability because it emphasizes certain concepts by assigning them names.
When you have a little pipeline whose intermediate state should be irrelevant to the context, assigning a name to that state is clutter. Extracting the little pipeline to a named top-level form is the alternative, but now you need to name that pipeline and that might be more confusing for a reader than letting them just see the familiar pieces it is composed from.
I suspect that if you’re going hybrid, a better way might be to do it via manual liftA2:
import Data.Bool (bool)
import System.IO (readFile')

main :: IO ()
main = (==)
    <$> getFile
    <*> getFile
    >>= putStrLn . bool "Failure!" "Success!"
  where
    getFile = getLine >>= readFile'
The usage of fmap or <$> makes the code much more amenable to verticalization, which helps avoid excessively horizontal code.
Interestingly enough, on IRC I was complaining about how liftA2 and <*> can’t guarantee an order of effect, but it turns out that the 5th monad law (which means that (<*>) = ap) guarantees an order of effect if the Monad instance exists for the Applicative type.
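The agreement between ap and <*> can be spot-checked for one concrete instance. This sketch uses the pair monad ((,) [String]) from base, where the log makes the order of effects visible; the action names are invented. Note that the law is a convention instances are expected to follow, not something the compiler checks.

```haskell
import Control.Monad (ap)

-- Writer-like monad from base: the first component is a log of effects.
type Logged = (,) [String]

-- An action that logs "first" and yields a function.
compareAction :: Logged (Int -> Bool)
compareAction = (["first"], (== 3))

-- An action that logs "second" and yields an argument.
valueAction :: Logged Int
valueAction = (["second"], 3)

-- Combined through the Applicative instance.
viaApplicative :: Logged Bool
viaApplicative = compareAction <*> valueAction

-- Combined through the Monad instance, using ap.
viaMonad :: Logged Bool
viaMonad = compareAction `ap` valueAction
```

For this instance both definitions are (["first", "second"], True): the function-producing action runs before the argument-producing one either way.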
The point of blended / hybrid / where style is to try to separate specification from implementation, to enable a fast perusal of code without caring for the implementation details unless something seems wonky. You can still name monadic actions, provided that they’re not the last line, by simply giving them a dummy bind (nameOfAction <- myActionChain), which, unfortunately, will make the compiler complain as the nameOfAction is unused.
You can even use this technique to name blocks of code, i.e, nameOfAction <- do..., which can be more ergonomic than straight commenting.
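A minimal sketch of both tricks, again assuming the pair monad ((,) [String]) as a loggable stand-in for IO. The step names are made up, and the leading underscore on _loadConfig is the usual convention for telling GHC that a binder is intentionally unused.

```haskell
-- Writer-like monad from base: the first component is a log of effects.
type Logged = (,) [String]

pipeline :: Logged Int
pipeline = do
  -- Dummy bind: the name documents the step; its result is never used.
  _loadConfig <- (["load config"], ())
  -- A named block: the name on the left labels the nested do.
  total <- do
    a <- (["read a"], 3)
    b <- (["read b"], 4)
    pure (a + b)
  pure total
```

Here pipeline evaluates to (["load config", "read a", "read b"], 7), so the labels change nothing about the effects or the result.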
Two thoughts here.
First: the benefit of purity in Haskell is not about making it look pure - it is about making it easier to reason about the code, and whether you do or do not get this benefit only superficially depends on the syntax you choose to use for your monadic code. The key thing about pure code is the absence of (side) effects; and depending on how you look at it, monadic code is either effectful, or it’s not, but massaging it into a different surface syntax does not change that. getLine >>= putStrLn has exactly the same effects as do { x <- getLine; putStrLn x }, there is literally no difference wrt purity.
In other words: we don’t write pure functional code so we can be all smug about our superior looking coding style; we do it because it creates opportunities for equational reasoning, because it makes code easier to truly understand, and as a consequence, makes it easier to write correct code.
Second: Haskell offers a wide range of equivalent but different-looking ways of writing the same code, and that is a good thing - it means we can pick the style that most closely matches what we want to say. Code is first and foremost a human-to-human language, and while it needs to “do the right thing”, that’s just the starting point - we also want code to express the programmer’s intentions, make it easy for a reader (possibly the original programmer’s future self) to retrace the thought patterns, knowledge, and assumptions that are encoded in it, see the structure of the problem being solved and how it maps to the machine side of the code, etc. Having many alternative ways of saying the same thing gives us more options to structure code in such a way as to reflect our mental model, beyond the technicalities of making the machine do the right thing.
Sometimes, we want to take the reader by the hand and walk them through a piece of code along a sequence of events. “First, we ask for a name, then we ask for a social security number, then we look up the name in a database, and then we report the result” - I’d write that in quasi-imperative style, hands down:
do
    name <- readPrompt "Name"
    ssn <- readPrompt "Social Security Number"
    row <- query db (getUserByNameAndSSN name ssn)
    print row
Other times, we might prefer to think of a series of monadic actions as a pipeline, and Kleisli arrows might be a better fit; an example might be chaining middlewares in an HTTP API, where each middleware takes a request and produces a response, either short-circuiting, or forwarding things to the next middleware:
myApp =
    sessionMiddleware >=>
    fullPageCacheMiddleware >=>
    staticFilesMiddleware >=>
    autoContentTypeMiddleware >=>
    apiMain
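To show the short-circuiting behaviour in something that can be run, here is a toy version of the same shape. Everything in it is an assumption for illustration: requests and responses are plain strings, Either stands in for the real middleware monad, and requireToken, rejectEmpty and handler are hypothetical.

```haskell
import Control.Monad ((>=>))
import Data.Char (toUpper)
import Data.List (isPrefixOf)

type Request = String
type Response = String

-- Left carries a short-circuit response; Right forwards the request.
type Middleware = Request -> Either Response Request

-- Strips a "token:" prefix, or rejects the request.
requireToken :: Middleware
requireToken req
  | "token:" `isPrefixOf` req = Right (drop 6 req)
  | otherwise = Left "401 Unauthorized"

-- Rejects requests with an empty body.
rejectEmpty :: Middleware
rejectEmpty req
  | null req = Left "400 Bad Request"
  | otherwise = Right req

-- The final handler always produces a response.
handler :: Request -> Either Response Response
handler req = Right ("200 " ++ map toUpper req)

-- Kleisli composition: each stage runs only if the previous one forwarded.
app :: Request -> Either Response Response
app = requireToken >=> rejectEmpty >=> handler
```

With these definitions, app "token:hello" is Right "200 HELLO", app "hello" is Left "401 Unauthorized", and app "token:" is Left "400 Bad Request".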
Yet another common situation is where we need to gather a bunch of inputs, and then we construct some kind of data structure from them, like a record type. This is a staple in serialization/deserialization code, for example. Applicative style shines here:
getUser =
    User
        <$> getInt
        <*> getString
        <*> getEmail
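A self-contained variant of the same pattern, using Maybe as the effect so that it runs without a serialization library. The User fields and the three parse functions are assumptions invented for this sketch, not the getInt / getString / getEmail of the snippet above.

```haskell
import Text.Read (readMaybe)

data User = User
  { userId    :: Int
  , userName  :: String
  , userEmail :: String
  } deriving (Eq, Show)

-- Each field has its own small parser that may fail.
parseId :: String -> Maybe Int
parseId = readMaybe

parseName :: String -> Maybe String
parseName s = if null s then Nothing else Just s

-- Deliberately naive check: only requires an '@' somewhere.
parseEmail :: String -> Maybe String
parseEmail s = if '@' `elem` s then Just s else Nothing

-- Applicative style: the constructor is applied to the parsed fields,
-- and the whole result is Nothing if any single field fails.
parseUser :: String -> String -> String -> Maybe User
parseUser i n e =
  User
    <$> parseId i
    <*> parseName n
    <*> parseEmail e
```

For example, parseUser "7" "Ada" "ada@example.com" yields Just a User with those three fields, while a non-numeric id or an email without '@' yields Nothing.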
And then we have the where vs. let debate - but this one is not specific to monadic code or do notation, the same decision still needs to be made in non-monadic code, and AFAICT, there is no objective winner. Some people prefer a top-down approach, where you state the “big picture” first, and then the reader can, if they so wish, scroll down to read the definitions of the things you used in the big picture. You would generally use where for this. Others prefer a bottom-up approach, where you start by presenting your building blocks, and then proceed to using and combining them, stating the “big picture” last. For this, you would use let. (Of course there are also technical differences between where and let, most importantly scope, but those are relatively minor, and in most cases, the two can be used more or less interchangeably, modulo ordering).
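The two orderings side by side, as a small sketch (the hypotenuse example and its names are chosen only for illustration):

```haskell
-- Top-down: the result comes first, the pieces follow in a where block.
hypotenuseWhere :: Double -> Double -> Double
hypotenuseWhere a b = sqrt sumOfSquares
  where
    sumOfSquares = square a + square b
    square x = x * x

-- Bottom-up: the building blocks come first, the result last.
hypotenuseLet :: Double -> Double -> Double
hypotenuseLet a b =
  let square x = x * x
      sumOfSquares = square a + square b
  in sqrt sumOfSquares
```

Both compute the same value (5.0 for the arguments 3 and 4); only the reading order differs.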
Interestingly, the choice for top-down vs. bottom-up also has a cultural component to it. I learned about these cultural differences while working for a Dutch company that had just bought a German competitor, and I could witness the argument culture clash first hand (and it was occasionally hilarious). In a nutshell: Dutch argument culture starts with the conclusion and then provides the reasoning and evidence as needed; German argument culture starts with the evidence and then proceeds to reasoning and ends with the conclusion. “We should do X, because…” vs. “given the facts …, we should do X”. Both are valid, but if you are unaware of the cultural difference, the two can clash rather violently, with the Dutch thinking “get to the point”, and the Germans thinking “but where is the evidence”.
Personally, I’m a huge fan of having all these different styles at your disposal, and using whichever describes your thought patterns best. And if that’s a tie, I’d go with whichever style looks most straightforward, assuming a reader who is equally fluent in all styles. More often than not, this leads to a “mixed” style, and again, I think that’s a good thing. I’ll even mix let and where within the same function, if that helps me express myself better - e.g., I might introduce intermediate variables used in a multi-step calculation using let, as they become relevant, so that the reader can follow the calculation as it’s built up, but I might factor out sub-calculations into local functions defined in a where block, assuming that their names make it sufficiently clear what they are supposed to do, and putting the definitions upfront would be detrimental to the reading flow.