Monadic Code In Haskell

tomjaguarpaw · October 13, 2023, 7:16pm

Strongly agreed, as those who saw my answer to Jose Valim’s challenge will probably guess.

atravers · October 13, 2023, 8:53pm

…along with all the other “linglyphics” associated with effect-centric code in Haskell e.g. from Applicative, et al.

Agreed: one person’s moment of cuteness or cleverness can often look like “write-only code” to others - we cannot assume that everyone who uses Haskell these days has the years or decades of experience that those of us who were there “in the early days” have now accumulated.

So as ugly as Haskell’s “two-tone” syntax (for regular and imperative Haskell) is, for now using do-notation is the least ugly option (it lessens the need for those linglyphics).

atravers · October 14, 2023, 12:04am

Another cost of mixed styles is more frequent “switching” between:

“expression-style” contexts, (which use (>>=), (>>) and the rest),
and “statement-style” ones (using do and <-);

…in order to understand the resulting code - not only is this an acquired skill for most, but the need to remember which context is being used is still an expense to be avoided, no matter what the magnitude of Haskell experience.

But this should not be thought of as an endorsement by me of do-notation: if I could go back to 1996, we would all be writing something more like:

program :: IO ()
program = do () | let a = 3,
                  let b = 4,
                  print (a + b)

i.e. do-blocks would have the syntax of list comprehensions, but without the brackets (those being reserved only for actual list comprehensions).

Liamzy · October 14, 2023, 7:28am

To specify, I think this is an important problem because for a lot of applications (but not all, or even most), Haskell naturally ends up with a lot of monadic code and relatively little pure code (not imperative shell, functional core, but imperative bones, functional flesh).

When it comes to #1, the problem comes down to simply not being good enough.

Yes, Haskell has a notion of effect, but the way I describe do-notation is as “somewhat more verbose Python”, and even though Python is widely regarded as the most readable of all imperative languages, I’d expect that Haskell can do better.

That is why I’m interested in #2, #3, and #4; i.e, if a lot of Haskell code is going to be monadic, it’d better be a very pleasant monadic experience, otherwise the only thing you have to offer is having an explicit notion of effects.

As far as going for #2, #3, and #4, over #1, yes, I’m aware both of Do Notation Considered Harmful, as well as, I believe, Chris Allen mentioning that many Haskell learners go through a “**** do notation” phase.

And I agree with the last statement; i.e, avoiding do notation entirely is a phase people go through on their Haskell journey (I’m at least a year past this), but being purist on “imitation Cobol/Python”, is something I have to question too.

tomjaguarpaw · October 14, 2023, 8:24am

Sorry if I missed it, but I don’t think you explained this. You said 1 is “a waste” and “simply not good enough” but I don’t understand why you think that.

BurningWitness · October 14, 2023, 8:42am

When it comes to #1, the problem comes down to simply not being good enough.

It absolutely is good enough. Results are always produced on the left and every line describes a single step. It can be split or combined to the level of granularity wanted and it’s pleasantly uniform.

Using anything else to spice up the control flow immediately kills this simplicity and the resulting code is guaranteed to be harder to both read and alter.

the way I describe do-notation is as “somewhat more verbose Python”

Are you sure this is on the do-notation and not on the fact that virtually everything you do in Haskell is more verbose? You’ll need a mountain of sugar to compete in shortness with that.

prophet · October 14, 2023, 9:25am

Frankly, “do notation considered harmful” is at best misleading and at worst completely wrong. Again, there is nothing more imperative about do notation than about any other use of Monads.

If you don’t want to use monads, that’s fine I guess but don’t fool yourself into thinking that writing out obtuse operators for simple monadic code makes you any more “pure” than others.

Well, yes, that is exactly what monads are. (>>=) is literally the monadic equivalent of a let binding.
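
To make that analogy concrete, here is a minimal sketch using the Identity monad from base, where a bind does nothing beyond naming a result. The names withLet and withBind are made up for illustration; this is one way to picture the correspondence, not a formal statement of it.

```haskell
import Data.Functor.Identity (Identity (..))

-- Plain let binding: name a value, then use it in the body.
withLet :: Int
withLet =
  let a = 3
   in a + 4

-- The same shape with (>>=): name the result of a (trivial) action,
-- then use it in the continuation.
withBind :: Identity Int
withBind = Identity 3 >>= \a -> Identity (a + 4)

main :: IO ()
main = print (withLet, runIdentity withBind)
```

In a monad with real effects the continuation after (>>=) plays the same role as the body of the let; the only difference is that the effects of the bound action are sequenced first.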

Liamzy · October 14, 2023, 10:46am

@tomjaguarpaw

#1 is a strawman, and I didn’t think this was going to be the approach everyone preferred. If I did, I’d have spent more time writing it.

The reason I think #1 is a waste, is because of the superfluous bindings, when it could have been written simply as print (3 + 4) or print 7.

Of course, making intermediate steps explicit is always useful when you’re dealing with a long function chain, but in the example given, it’s simply not useful.

As far as being “not good enough”, Haskell has a reputation for extremely elegant code. I see this quite often in the pure space, where the algorithm is extremely explicit and boilerplate-free, thanks to higher-order functions, recursion, and function composition.

When you get to do-notation, on the other hand, it’s hard to make the same apply.

At others:

The point is more, I’m trying to explore what is a strong way to write monadic code in such a way that readability and concision, if not necessarily accessibility, is better than Python.

I think Haskell can actually do this, but not via an apples-to-apples comparison.

The trick comes because in Haskell, it is, at least in some corners of the community, idiomatic to use >>=, <$>, liftA2 / <*>, before a do-notation bind (bar <- foo).

This is equivalent to using function pipelines or method chaining in more traditional languages, and that’s either controversial or smelly in traditional / imperative programming.

The whole monadic/applicative/functorial smorgasbord makes explicit what you’re doing, and even if it’s less concise or ends with weird operators, the explicitness cuts away the smell.

Here’s two programs, translated back and forth between Python and Haskell, which do the same thing. No, the exception control is horrible, but these are just samples showing what’s possible.

def main():
    file1_file_path = input()
    file2_file_path = input()
#
    file1 = open(file1_file_path).readlines()
    file2 = open(file2_file_path).readlines()
#
    if file1 == file2:
        print('Success!')
    else:
        print('Failure!')

import System.IO (readFile')

main :: IO ()
main = do
    file1FilePath <- getLine
    file2FilePath <- getLine

    file1 <- readFile' file1FilePath
    file2 <- readFile' file2FilePath

    if file1 == file2
        then putStrLn "Success!"
        else putStrLn "Failure!"

Okay, so that’s version one. Let’s see the same semantics, but ordered a bit differently:

import Control.Applicative (liftA2) -- needed before base 4.18, where liftA2 is not in Prelude
import System.IO (readFile')
import Data.Bool (bool)

main :: IO ()
main =  liftA2 (==) (getLine >>= readFile') (getLine >>= readFile')
    >>= putStrLn . bool "Failure!" "Success!"

def main():
    print('Success!' if open(input()).readlines() == open(input()).readlines() else 'Failure!')

Okay, so in both versions, the Python version wins out in terms of concision. But when you’re looking at the 2nd Haskell version, it’s more concise than the 1st Python version, I’d say it’s more readable, and it’s idiomatic, at least among some Haskellers.

The 2nd Python version, on the other hand, I’d think is generally smelly because it doesn’t clearly indicate an order of effect, and will get you screamed at by the Python style purity brigade.

f-a · October 14, 2023, 11:23am

Wait for that angry code review!

Generally speaking I think with Haskell you can make good use of horizontal space, but let us not forget code is written once and read many times. I prefer version #1: what is happening is immediately clear, you can sprinkle traceM later if needed, etc.

The larger the function gets, the more I prefer unambiguous, simple code. If it gets too long, you can always move stuff to where; adding signatures helps too!

atravers · October 14, 2023, 11:42am

Unfortunately, that’s usually the case with most larger forms of syntactic sugar: they don’t provide a way to use certain “granules” as that would lead to syntax errors. An example is the use of <-, which some have thought was a Haskell operator that can somehow extract the result of any monadic expression (it isn’t and it doesn’t).

Considering that:

I think we can assume that the inspiration for Python’s procedural syntax was in large part the functional syntax in Miranda® and Haskell - if I understand you correctly, you are now attempting to find a way to make Haskell’s procedural (do-notation) syntax as neat as Python’s. If so, then by way of “syntactic transitivity”, it seems you’re wanting a more functional syntax for monadic expressions, beyond either do-notation or regular Haskell expression/function syntax involving the various monadic operators - you’re taking on quite a challenge there!

Before you proceed any further:

To add:

In summary:

at the time, do-notation was considered to be the most-palatable form of syntactic sugar.
while it started out as the work of a few individuals, it entered standard Haskell by way of group approval.

prophet · October 14, 2023, 12:02pm

Well that is an interesting example because I absolutely disagree that the second Haskell version is more readable.
It just unnecessarily obscures the order of effects and I don’t think I’m in the minority with this opinion. In fact, you got it wrong yourself! ^^

These two versions aren’t actually quite equivalent.

The order of effects in the first one is (obviously because you can clearly read it off the order of statements) getLine -> getLine -> readFile' -> readFile'.

But your second version actually performs effects in the order getLine -> readFile' -> getLine -> readFile', which manifests itself if the first file is not found.
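
The difference can be made visible without touching the file system. The following is a minimal sketch that swaps IO for the pair monad from base, used here as a tiny logger; askPath and readContents are hypothetical stand-ins for getLine and readFile', and the first component of each result records the order in which the effects run.

```haskell
import Control.Applicative (liftA2)

-- The pair monad from base acts as a tiny logger: the first component
-- accumulates the names of the effects in the order they run.
type Logged = (,) [String]

-- Hypothetical stand-ins for getLine and readFile'.
askPath :: Logged FilePath
askPath = (["getLine"], "some/path")

readContents :: FilePath -> Logged String
readContents _ = (["readFile'"], "contents")

-- Shape of the first version: both prompts, then both reads.
doStyle :: Logged Bool
doStyle = do
  path1 <- askPath
  path2 <- askPath
  file1 <- readContents path1
  file2 <- readContents path2
  pure (file1 == file2)

-- Shape of the second version: prompt and read, then prompt and read.
liftedStyle :: Logged Bool
liftedStyle =
  liftA2 (==) (askPath >>= readContents) (askPath >>= readContents)

main :: IO ()
main = do
  print (fst doStyle)
  print (fst liftedStyle)
```

Both versions compute the same Bool, but the logs differ, which is exactly the discrepancy that shows up in IO when the first file is missing.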

I also find the claim that this is “idiomatic” somewhat objectionable. Just as one data point from someone who definitely knows what she’s doing, the pipes tutorial (which is amazing btw) only uses the first style.

atravers · October 14, 2023, 12:26pm

From this example:

…it seems you prefer the “point-free” style - at the time, so did John Backus:

Can Programming Be Liberated from the von Neumann Style? … (1978)

Liamzy · October 14, 2023, 1:26pm

@prophet

Here’s another point I’ve gotten wrong. The Haskell version, if you ignore the usual “10 lines of extensions, 25 lines of imports” stuff Haskell code often incorporates (and here to a very tiny degree), is actually very slightly smaller than the Python version.

@atravers

To clarify, I prefer restricted point-free, i.e, I think point-free can easily become unreadable when the chain gets too long, but when used in moderation it’s beneficial.

If, say, the problem is “get two file names on the input prompt, try to compare the files represented, and return success if they’re the same and failure if not”, my actual preference would be:

import System.IO (readFile')

main :: IO ()
main = do
    file1 <- getLine >>= readFile'
    file2 <- getLine >>= readFile'

    putStrLn
        $ if file1 == file2
            then "Success"
            else "Failure"

This is a blend of the readability of #1 and the concision of #2.

As a limit of where I think point-free starts falling apart:

apiGet :: IO ()
apiGet = do
    res <- getChar
        >>= parseRequest . \case
            '1' -> urls !! 0
            ___ -> urls !! 1
        >>= httpLBS

    traverse_ putStrLn $ responseBodyBlock res
  where
    urls =
      [ "https://jsonplaceholder.typicode.com/todos/1"
      , "https://api.example.com/data"
      ]

    responseBodyBlock resource =
      [ "Response body:"
      , unpack $ getResponseBody resource
      , "OK1"
      ]

Just a snippet, a refactor of someone else’s code. Someone suggested that I take out do entirely, and turn it into a purely monadic pipeline.

But for me, it’s obvious that, even if I’m point-freeing to avoid some unnecessary bindings, the “get data” and “print data” are two separate steps, and just directly pipelining the traverse would make it significantly harder for me to read.

If this were to be turned into a pure pipeline, it’d follow the #4 model, where the steps are clearly delineated on the top-level, then have their implementation specified in the where clause.

And I’m already uncomfortable with the pipeline; it is at the maximum level I can tolerate. Anything more, I’d stuff a name on it and push it down to the where clause.

To summarize, then, I like, but do not push for, a semi-point-free style often mediated by do-notation.

atravers · October 14, 2023, 6:56pm

I think I now understand your perspective on this matter: Haskell is labelled as being a functional language, but more often than not you find yourself looking at vast expanses of rather-procedural code.

The first nonstrict functional language I learned about was Miranda®. So the transition to Haskell, and the already-widespread use of do-notation, was a slow one (it being more the result of necessity rather than preference). But I remained dissatisfied with Haskell’s imperative style for working with effects - over the last twenty years or so, I’ve learned to tolerate it, but not accept it (if many of the quotes, articles and references I’ve placed here and elsewhere didn’t already make that obvious ;-).

This remark:

…seems “about right” on this matter: these days, it seems most new Haskellers are far more interested in just “getting up and going”, which that simple “ultra-imperative” style allows them to do more quickly, particularly if they have prior exposure to imperativity. So for now, we need to write code that they can easily understand, even if we aren’t entirely satisfied with it.

Liamzy · October 14, 2023, 8:07pm

Yeah, but the point of this thread was to ask a question about how people style their monadic code, i.e, what kinds of best practices people had arrived at.

I’m just really disappointed that the most common answer was #1, that is to say, people seem to be seeking an imperativeness which is greater than that of JavaScript, Python, F#, Elixir, and OCaml.

In the two mainstream languages, mild use of method chaining has become mainstream, and method chaining is somewhat similar to direct monadic bind in terms of style.

In the impure functional languages, pipeline operators are both used and abused, and there’s effectively experience around when pipelining is bad, as opposed to just trying to ban direct monadic bind altogether.

Ambrose · October 14, 2023, 8:26pm

Imperative Haskell code isn’t the same as imperative JS, Rust, Go, etc. The fact that it’s first class and has Monad etc instances means you can metaprogram Haskell imperative code very nicely. That’s where it shines imo.

atravers · October 14, 2023, 9:28pm

That’s where it shines […]

…like the proverbial “gilded cage” where I/O is concerned:

acowley · October 20, 2023, 1:14pm

I also think this blended or hybrid approach is preferable. To me, semi-point-free code absolutely can improve readability because it emphasizes certain concepts by assigning them names.

When you have a little pipeline whose intermediate state should be irrelevant to the context, assigning a name to that state is clutter. Extracting the little pipeline to a named top-level form is the alternative, but now you need to name that pipeline and that might be more confusing for a reader than letting them just see the familiar pieces it is composed from.

Liamzy · October 23, 2023, 10:21pm

I suspect that if you’re going hybrid, a better way might be to do it via manual liftA2:

import Data.Bool (bool)
import System.IO (readFile')

main :: IO ()
main =  (==)
    <$> getFile
    <*> getFile
    >>= putStrLn . bool "Failure!" "Success!"
  where
    getFile = getLine >>= readFile'

The usage of fmap or <$> makes the code much more amenable to verticalization, which helps avoid excessively horizontal code.

Interestingly enough, on IRC I was complaining about how liftA2 and <*> can’t guarantee an order of effect, but it turns out that the 5th monad law (which means that ap = (<*>)) guarantees an order of effect if the Monad instance exists for the Applicative type.
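
As a small check of that law on one concrete instance, here is a sketch that again assumes the pair monad from base as a stand-in logger (the helper logged is made up). It only demonstrates the agreement for this particular monad; it is not a proof for every instance.

```haskell
import Control.Monad (ap)

-- Pair monad as a logger: the list records the order of effects.
logged :: String -> a -> ([String], a)
logged label x = ([label], x)

-- Combining with the Applicative operator.
viaApplicative :: ([String], Int)
viaApplicative = logged "function" (+ 1) <*> logged "argument" 41

-- Combining with ap, which is defined in terms of (>>=).
viaMonad :: ([String], Int)
viaMonad = logged "function" (+ 1) `ap` logged "argument" 41

main :: IO ()
main = do
  print viaApplicative
  print viaMonad
```

Since ap runs its left argument before its right one, a lawful instance has to sequence <*> the same way, left to right.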

@acowley

The point of blended / hybrid / where style is to try to separate specification from implementation, to enable a fast perusal of code without caring for the implementation details unless something seems wonky. You can still name monadic actions, provided that they’re not the last line, by simply giving them a dummy bind (nameOfAction <- myActionChain), which, unfortunately, will make the compiler complain as the nameOfAction is unused.

You can even use this technique to name blocks of code, i.e, nameOfAction <- do... which can be more ergonomic than straight commenting.
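
A minimal sketch of that naming trick, assuming the pair monad from base in place of IO so that the resulting order can be inspected; say and the block names are hypothetical. Prefixing the bound name with an underscore is the usual way to tell GHC the binding is deliberately unused, which avoids the warning mentioned above.

```haskell
-- Pair monad from base used as a stand-in for IO so the order is visible.
say :: String -> ([String], ())
say msg = ([msg], ())

program :: ([String], ())
program = do
  -- The leading underscore marks the name as deliberately unused.
  _greetUser <- do
    say "Hello"
    say "Please enter two file names"

  _reportResult <- do
    say "Comparing files"
    say "Success!"

  -- A named block cannot be the last statement, so finish explicitly.
  pure ()

main :: IO ()
main = mapM_ putStrLn (fst program)
```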

tdammers · October 24, 2023, 6:19am

Two thoughts here.

First: the benefit of purity in Haskell is not about making it look pure - it is about making it easier to reason about the code, and whether you do or do not get this benefit only superficially depends on the syntax you choose to use for your monadic code. The key thing about pure code is the absence of (side) effects; and depending on how you look at it, monadic code is either effectful, or it’s not, but massaging it into a different surface syntax does not change that. getLine >>= putStrLn has exactly the same effects as do { x <- getLine; putStrLn x }, there is literally no difference wrt purity.
In other words: we don’t write pure functional code so we can be all smug about our superior looking coding style; we do it because it creates opportunities for equational reasoning, because it makes code easier to truly understand, and as a consequence, makes it easier to write correct code.
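
A minimal sketch of that equivalence, using Maybe instead of IO so that the two spellings can be compared directly. The function names are made up for illustration.

```haskell
import Text.Read (readMaybe)

-- Parse two numbers and add them, written with explicit binds.
addWithBind :: String -> String -> Maybe Int
addWithBind s t =
  readMaybe s >>= \x ->
    readMaybe t >>= \y ->
      pure (x + y)

-- The same computation in do-notation, which desugars to binds like the above.
addWithDo :: String -> String -> Maybe Int
addWithDo s t = do
  x <- readMaybe s
  y <- readMaybe t
  pure (x + y)

main :: IO ()
main = do
  print (addWithBind "3" "4", addWithDo "3" "4")
  print (addWithBind "3" "oops", addWithDo "3" "oops")
```

Both functions succeed and fail on exactly the same inputs; only the surface syntax differs.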

Then; Haskell offers a wide range of equivalent but different-looking ways of writing the same code, and that is a good thing - it means we can pick the style that most closely matches what we want to say. Code is first and foremost a human-to-human language, and while it needs to “do the right thing”, that’s just the starting point - we also want code to express the programmer’s intentions, make it easy for a reader (possibly the original programmer’s future self) to retrace the thought patterns, knowledge, and assumptions that are encoded in it, see the structure of the problem being solved and how it maps to the machine side of the code, etc. Having many alternative ways of saying the same thing gives us more options to structure code in such a way as to reflect our mental model, beyond the technicalities of making the machine do the right thing.

Sometimes, we want to take the reader by the hand and walk them through a piece of code along a sequence of events. “First, we ask for a name, then we ask for a social security number, then we look up the name in a database, and then we report the result” - I’d write that in quasi-imperative style, hands down:

do
    name <- readPrompt "Name"
    ssn <- readPrompt "Social Security Number"
    row <- query db (getUserByNameAndSSN name ssn)
    print row

Other times, we might prefer to think of a series of monadic actions as a pipeline, and Kleisli arrows might be a better fit; an example might be chaining middlewares in an HTTP API, where each middleware takes a request and produces a response, either short-circuiting, or forwarding things to the next middleware:

myApp =
    sessionMiddleware >=>
    fullPageCacheMiddleware >=>
    staticFilesMiddleware >=>
    autoContentTypeMiddleware >=>
    apiMain
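
For readers who have not met (>=>) before, here is a self-contained sketch of the same shape. It assumes a much simpler model than a real HTTP stack: a handler either passes the request on (Right) or short-circuits with a response (Left). The Request type and all handler names are hypothetical.

```haskell
import Control.Monad ((>=>))

-- Hypothetical request type and handlers, for illustration only.
newtype Request = Request { path :: String }

type Handler = Request -> Either String Request

-- Short-circuits with an error response when the path is empty.
requireNonEmptyPath :: Handler
requireNonEmptyPath req
  | null (path req) = Left "400: empty path"
  | otherwise = Right req

-- Short-circuits for one forbidden path, forwards everything else.
rejectAdmin :: Handler
rejectAdmin req
  | path req == "/admin" = Left "403: forbidden"
  | otherwise = Right req

-- Kleisli composition chains the handlers; the first Left wins.
app :: Request -> Either String String
app =
  requireNonEmptyPath >=>
  rejectAdmin >=>
  (\req -> Right ("200: served " ++ path req))

main :: IO ()
main = mapM_ (print . app . Request) ["/index.html", "/admin", ""]
```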

Yet another common situation is where we need to gather a bunch of inputs, and then we construct some kind of data structure from them, like a record type. This is a staple in serialization/deserialization code, for example. Applicative style shines here:

getUser =
    User
        <$> getInt
        <*> getString
        <*> getEmail

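A runnable sketch of the same pattern, assuming Maybe as the parsing context and three made-up field parsers. The User record here is hypothetical and only mirrors the shape of the snippet above.

```haskell
import Text.Read (readMaybe)

data User = User
  { userId :: Int
  , userName :: String
  , userEmail :: String
  }
  deriving (Show, Eq)

-- Hypothetical field parsers; each may fail independently.
parseId :: String -> Maybe Int
parseId = readMaybe

parseName :: String -> Maybe String
parseName s = if null s then Nothing else Just s

parseEmail :: String -> Maybe String
parseEmail s = if '@' `elem` s then Just s else Nothing

-- Applicative style: the constructor is applied to the parsed fields in order.
parseUser :: String -> String -> String -> Maybe User
parseUser i n e =
  User
    <$> parseId i
    <*> parseName n
    <*> parseEmail e

main :: IO ()
main = do
  print (parseUser "7" "Ada" "ada@example.com")
  print (parseUser "seven" "Ada" "ada@example.com")
```
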
And then we have the where vs. let debate - but this one is not specific to monadic code or do notation, the same decision still needs to be made in non-monadic code, and AFAICT, there is no objective winner. Some people prefer a top-down approach, where you state the “big picture” first, and then the reader can, if they so wish, scroll down to read the definitions of the things you used in the big picture. You would generally use where for this. Others prefer a bottom-up approach, where you start by presenting your building blocks, and then proceed to using and combining them, stating the “big picture” last. For this, you would use let. (Of course there are also technical differences between where and let, most importantly scope, but those are relatively minor, and in most cases, the two can be used more or less interchangeably, modulo ordering).
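
To illustrate the two reading orders on a deliberately tiny function (the names are made up):

```haskell
-- Top-down: the "big picture" comes first, details follow in where.
hypotenuseWhere :: Double -> Double -> Double
hypotenuseWhere a b = sqrt sumOfSquares
  where
    sumOfSquares = square a + square b
    square x = x * x

-- Bottom-up: building blocks first, the result last.
hypotenuseLet :: Double -> Double -> Double
hypotenuseLet a b =
  let square x = x * x
      sumOfSquares = square a + square b
   in sqrt sumOfSquares

main :: IO ()
main = print (hypotenuseWhere 3 4, hypotenuseLet 3 4)
```

The two definitions compute the same value; the choice only affects the order in which a reader meets the pieces.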

Interestingly, the choice for top-down vs. bottom-up also has a cultural component to it. I learned about these cultural differences while working for a Dutch company that had just bought a German competitor, and I could witness the argument culture clash first hand (and it was occasionally hilarious). In a nutshell: Dutch argument culture starts with the conclusion and then provides the reasoning and evidence as needed; German argument culture starts with the evidence and then proceeds to reasoning and ends with the conclusion. “We should do X, because…” vs. “given the facts …, we should do X”. Both are valid, but if you are unaware of the cultural difference, the two can clash rather violently, with the Dutch thinking “get to the point”, and the Germans thinking “but where is the evidence”.

Personally, I’m a huge fan of having all these different styles at your disposal, and using whichever describes your thought patterns best. And if that’s a tie, I’d go with whichever style looks most straightforward, assuming a reader who is equally fluent in all styles. More often than not, this leads to a “mixed” style, and again, I think that’s a good thing. I’ll even mix let and where within the same function, if that helps me express myself better - e.g., I might introduce intermediate variables used in a multi-step calculation using let, as they become relevant, so that the reader can follow the calculation as it’s built up, but I might factor out sub-calculations into local functions defined in a where block, assuming that their names make it sufficiently clear what they are supposed to do, and putting the definitions upfront would be detrimental to the reading flow.