How can I better read and understand a section of Haskell code?

Thank you, community.

I’ll use a minimal working example to frame my question:

{-# LANGUAGE InstanceSigs #-}
module Main where

import Control.Applicative ((<|>))
import Text.Parsec (parse, many1, digit, string, oneOf, try, char, manyTill, anyToken, lookAhead, many, noneOf, between)
import Text.Parsec.String (Parser)

data Tree = Leaf String | Node [Tree]

instance Show Tree where
  show :: Tree -> String
  show (Leaf x) = x
  show (Node xs) = "(" ++ concatMap show xs ++ ")"

brackets :: Parser Tree
brackets = Node <$> between
  (char '(') (char ')')
  (many (brackets <|> (Leaf <$> many1 (noneOf "()"))))

s :: String
s = "(An example using (nested (parenthesis)))"

main :: IO ()
main = print (parse brackets "" s)

The brackets function is supposedly a Parser Tree that parses brackets, including nested ones. My first step is to hover over between in the definition of brackets, where I see (at least) three pieces of information:

  • Type signature
between :: forall s (m :: Type -> Type) t u open close a.
Stream s m t =>
ParsecT s u m open
-> ParsecT s u m close -> ParsecT s u m a -> ParsecT s u m a
  • Comment explaining function behaviour

between open close p parses open, followed by p and close. Returns the value returned by p.

  • Example

braces = between (symbol "{") (symbol "}")
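To make the tooltip concrete, here is a minimal sketch that runs between on a few inputs. The names parens and number and the sample strings are made up for illustration; only between, char, digit, many1 and parse come from parsec.

```haskell
module Main where

import Text.Parsec (between, char, digit, many1, parse)
import Text.Parsec.String (Parser)

-- Hypothetical helper: wrap any parser in round brackets.
parens :: Parser a -> Parser a
parens = between (char '(') (char ')')

-- One or more digits inside round brackets.
-- The result is what the inner parser returns; the brackets are discarded.
number :: Parser String
number = parens (many1 digit)

main :: IO ()
main = do
  print (parse number "" "(123)")  -- Right "123"
  print (parse number "" "123")    -- a Left: no opening bracket
  print (parse number "" "(123")   -- a Left: no closing bracket
```

The first call shows the "Returns the value returned by p" part of the comment: only the digits survive.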

(The tooltip is displayed in VS Code, thanks to HLS, the Haskell Language Server, I think.)
To improve my understanding, I could look at the definitions of between, parse, and <$>.
So, my question is:
What’s a good, practical route to understanding what’s happening in, for example, the brackets function?

And a follow-up question: in general, we talk about the importance of understanding types, but
are types enough to understand Haskell code, or do we need comments to reason quickly without looking up function definitions?

I ask this because in that tooltip, function definitions are missing, so types seem to be more important than function definitions when reading Haskell code.

I’ll appreciate any thoughts from this community.

In general, if you’re looking at code someone else wrote, a good first step is to find out which library its functions come from. In your example, brackets is built from functions in the parsec library, such as between, char and many1.

There are some Haskell libraries that you probably need a tutorial to understand, because they expect you to use specific idioms that are difficult to guess if you haven’t seen them before. parsec is one of those libraries.

So, do you understand how functions like <$>, <|>, char or many1 are being used? If not, I recommend using Hoogle to search for the package on Hackage, which in this case is parsec, and scrolling down to the Readme section to look for tutorials. (In this case, information on using applicatives for parsing will be particularly relevant.)
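As a small taste of that applicative style, here is a sketch of how <$> and <|> are typically used with parsec. The Token type and the parser names are hypothetical and exist only for this illustration.

```haskell
module Main where

import Control.Applicative ((<|>))
import Text.Parsec (digit, letter, many1, parse)
import Text.Parsec.String (Parser)

-- Hypothetical result type, used only for this illustration.
data Token = Number Int | Word String
  deriving (Show, Eq)

-- many1 digit is a Parser String; <$> applies a function to its result.
numberTok :: Parser Token
numberTok = Number . read <$> many1 digit

wordTok :: Parser Token
wordTok = Word <$> many1 letter

-- <|> tries the left parser first; if that fails without consuming
-- any input, it tries the right one.
item :: Parser Token
item = numberTok <|> wordTok

main :: IO ()
main = do
  print (parse item "" "42abc")  -- Right (Number 42)
  print (parse item "" "abc42")  -- Right (Word "abc")
```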

If you’re already familiar with parsec-style parsing, the information you get from HLS should be much more useful. In this case your question needs a completely different answer, but I’d say that most of the time, looking at the three pieces of information (type signature, comment, example) in combination should be enough to work out what the function does.

To your follow-up question (“Are types enough?”), I’d say in general no; often you need a comment. But there are some cases where you can partly guess what a function does just from the type signature, and rare cases where the type is all you need to understand the function completely.
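A rough sketch of those three situations, using only base. All names here are made up for illustration.

```haskell
module Main where

-- The type is all you need: ignoring non-termination and errors,
-- a function of this type can only return its argument.
same :: a -> a
same x = x

-- The type lets you partly guess: the result can only contain elements
-- taken from the input list, but many functions fit.
candidateA, candidateB :: [a] -> [a]
candidateA = reverse
candidateB = take 1

-- The type tells you very little: you need a name, a comment, or the definition.
mystery1, mystery2 :: Int -> Int
mystery1 = (+ 1)
mystery2 = (* 2)

main :: IO ()
main = do
  print (same 'x')                                              -- 'x'
  print (candidateA [1, 2, 3 :: Int], candidateB [1, 2, 3 :: Int])  -- ([3,2,1],[1])
  print (mystery1 10, mystery2 10)                              -- (11,20)
```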

Thank you so much, @gcox!

I have done that and read more about Parsec in the documentation. I have a question about the reading:

Where would you start reading this definition in order to understand it? Would you take the rightmost expression:

many1 (noneOf "()")

then

Leaf <$> many1 (noneOf "()")

to see what value it defines, then

brackets <|> (Leaf <$> many1 (noneOf "()"))

and lastly, Node being fmapped over the Parser [Tree] that is next to it?

I struggle sometimes with deciding the reading flow of a definition like brackets. I would appreciate any insights.

Here is how I would mentally break apart the brackets function:

= Node <$> ...

OK, the Node constructor is fmapped over the Parser, so the result will be a Node.

between (char '(') (char ')') (...)

Here the naming in the documentation is very helpful: open, close, a. So an opening round bracket, a closing round bracket, and the result a of the parser that sits between them.

many (*brackets* <|> ...) 

OK, it’s recursive, and that really is the key. Of course it is, because the data type Tree is recursive too.

Leaf <$> many1 (noneOf "()")

Lastly, what’s a leaf? It’s a non-empty run of characters that are not brackets. Makes sense.
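One way to check this reading is to look at the shape of the parsed value. The sketch below keeps brackets as in the question but swaps the hand-written Show instance for a derived one (a change made here purely so the nesting is visible), and uses a shorter input string.

```haskell
module Main where

import Control.Applicative ((<|>))
import Text.Parsec (between, char, many, many1, noneOf, parse)
import Text.Parsec.String (Parser)

-- Same constructors as in the question, but with derived instances
-- (an assumption for this sketch) so the tree structure is printed as-is.
data Tree = Leaf String | Node [Tree]
  deriving (Show, Eq)

-- Recursive, just like Tree: a Node contains further Nodes or Leaves.
brackets :: Parser Tree
brackets = Node <$> between
  (char '(') (char ')')
  (many (brackets <|> (Leaf <$> many1 (noneOf "()"))))

main :: IO ()
main = print (parse brackets "" "(a(b(c)))")
-- Right (Node [Leaf "a",Node [Leaf "b",Node [Leaf "c"]]])
```

Each recursive call to brackets corresponds to one nested Node in the output.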

I don’t think there’s a general rule for where to start when reading function definitions, but if it’s code you can’t understand easily, I suggest first breaking it down into small parts, like this:

brackets :: Parser Tree
brackets = Node <$> nodeContent
  where
    nodeContent :: Parser [Tree]
    nodeContent = between (char '(') (char ')') nodeBody
    nodeBody :: Parser [Tree]
    nodeBody = many (brackets <|> leaf)
    leaf :: Parser Tree
    leaf = Leaf <$> leafContent
    leafContent :: Parser String
    leafContent = many1 (noneOf "()")

Then, look at each piece (each sub-definition in the where block) and see if you understand what it does. Since this is code for a Parsec-based parser, it defines several simple parsers and combines them into a more complex parser. You can test your understanding of each definition by seeing if you can give English descriptions of what the defined parser does. For example,

    leafContent = many1 (noneOf "()")

parses “a non-empty string containing any characters except parentheses”.
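A quick way to check such a description is to run the small parser on its own. This is only a sketch; the sample inputs are made up.

```haskell
module Main where

import Text.Parsec (many1, noneOf, parse)
import Text.Parsec.String (Parser)

leafContent :: Parser String
leafContent = many1 (noneOf "()")

main :: IO ()
main = do
  print (parse leafContent "" "abc(def")  -- Right "abc": stops before the bracket
  print (parse leafContent "" "(abc")     -- a Left: at least one character is required
  print (parse leafContent "" "")         -- a Left: empty input
```

If the results match your English description, you can move on to the next definition in the where block.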

Does that help?
