Monadic Parsing in Haskell

This is the completed code for anyone interested:

-- {-# LANGUAGE NoImplicitPrelude #-}
module MonadicParsingInHaskell () where
import Control.Monad (liftM, ap)
import Data.Char 
import GHC.Types
import GHC.Base hiding ((++), many, MonadPlus) 
import Data.List (concat)
import Prelude ((-), (+),(*), div)
 
newtype Parser a = Parser (String -> [(a,String)])

instance Functor Parser where
  fmap = liftM
instance Applicative Parser where
  pure a = Parser (\cs -> [(a,cs)]) -- same as return
  (<*>) = ap

class Monad m => MonadZero m where
    zero :: m a

class MonadZero m => MonadPlus m where
      (++) :: m a -> m a -> m a

instance MonadZero [] where
  zero :: [a]
  zero = []

instance MonadPlus [] where
  (++) :: [a] -> [a] -> [a]
  [] ++ ys = ys
  (x : xs) ++ ys = x : (xs ++ ys)

instance Monad Parser where
      return :: a -> Parser a
      return a = Parser (\cs -> [(a,cs)])
      (>>=) :: Parser a -> (a -> Parser b) -> Parser b
      p >>= f  = Parser (\cs -> concat [parse (f a) cs' | (a,cs') <- parse p cs])

item :: Parser Char
item = Parser (\cs -> case cs of
            ""     -> []
            (c:cs) -> [(c,cs)])

p :: Parser (Char,Char)
p  = do {c <- item; item; d <- item; return (c,d)}

instance MonadPlus Parser where
  p ++ q = Parser (\cs -> parse p cs ++ parse q cs)

instance MonadZero Parser where
      zero = Parser (\cs -> [])

parse :: Parser a -> String -> [(a, String)]
parse (Parser p) = p

many   :: Parser a -> Parser [a]
many p  = many1 p +++ return []

many'   :: Parser a -> Parser [a]
many' p  = many p +++ return []

(+++) :: Parser a -> Parser a -> Parser a
p +++ q = Parser (\cs -> case parse (p ++ q) cs of
                               []     -> []
                               (x:xs) -> [x])

sat  :: (Char -> Bool) -> Parser Char
sat p = do {c <- item; if p c then return c else zero}

char :: Char -> Parser Char
char c = sat (c ==)

string :: String -> Parser String
string ""     = return ""
string (c:cs) = do {char c; string cs; return (c:cs)}

many1  :: Parser a -> Parser [a]
many1 p = do {a <- p; as <- many p; return (a:as)}

sepby :: Parser a -> Parser b -> Parser [a]
p `sepby` sep  = (p `sepby1` sep) +++ return []

sepby1 :: Parser a -> Parser b -> Parser [a]
p `sepby1` sep = do a <-p
                    as <- many (do {sep; p})
                    return (a:as)

chainl :: Parser a -> Parser (a -> a -> a) -> a -> Parser a
chainl p op a = (p `chainl1` op) +++ return a

chainl1 :: Parser a -> Parser (a -> a -> a) -> Parser a
p `chainl1` op = do {a <- p; rest a}
                    where
                       rest a = (do f <- op
                                    b <- p
                                    rest (f a b))
                                +++ return a

space :: Parser String
space = many (sat isSpace)

token  :: Parser a ->  Parser a
token p = do
    a <- p; space; return a

symb :: String -> Parser String
symb cs = token (string cs)

apply  :: Parser a -> String -> [(a,String)]
apply p = parse (do {space; p})

expr  :: Parser Int
expr = term `chainl1` addop

addop :: Parser (Int -> Int -> Int)
addop = do {symb "+"; return (+)} +++ do {symb "-"; return (-)}

mulop :: Parser (Int -> Int -> Int)
mulop = do {symb "*"; return (*)} +++ do {symb "/"; return div}

term :: Parser Int
term = factor `chainl1` mulop

factor :: Parser Int
factor = digit +++ do {symb "("; n <- expr; symb ")"; return n}

digit :: Parser Int
digit = do {x <- token (sat isDigit);  return (ord x - ord '0')}
3 Likes

Nice! If you want to see a very similar piece of self-contained parsing code, in slightly more modern style, I really like:

https://smunix.github.io/dev.stephendiehl.com/fun/002_parsers.html

1 Like

This is great, but I am again facing a few errors:

GHCi, version 9.4.8: https://www.haskell.org/ghc/  :? for help
[1 of 1] Compiling NanoParsec       ( Parsing.hs, interpreted )

Parsing.hs:75:21: error:
    • No instance for (Alternative f) arising from a use of ‘<|>’
      Possible fix:
        add (Alternative f) to the context of
          the type signature for:
            some :: forall (f :: * -> *) a. f a -> f [a]
    • In the expression: some_v <|> pure []
      In an equation for ‘many_v’: many_v = some_v <|> pure []
      In an equation for ‘some’:
          some v
            = some_v
            where
                many_v = some_v <|> pure []
                some_v = (:) <$> v <*> many_v
   |
75 |     many_v = some_v <|> pure []
   |                     ^^^

Parsing.hs:75:25: error:
    • No instance for (Applicative f) arising from a use of ‘pure’
      Possible fix:
        add (Applicative f) to the context of
          the type signature for:
            some :: forall (f :: * -> *) a. f a -> f [a]
    • In the second argument of ‘(<|>)’, namely ‘pure []’
      In the expression: some_v <|> pure []
      In an equation for ‘many_v’: many_v = some_v <|> pure []
   |
75 |     many_v = some_v <|> pure []
   |                         ^^^^

Parsing.hs:76:18: error:
    • No instance for (Functor f) arising from a use of ‘<$>’
      Possible fix:
        add (Functor f) to the context of
          the type signature for:
            some :: forall (f :: * -> *) a. f a -> f [a]
    • In the first argument of ‘(<*>)’, namely ‘(:) <$> v’
      In the expression: (:) <$> v <*> many_v
      In an equation for ‘some_v’: some_v = (:) <$> v <*> many_v
   |
76 |     some_v = (:) <$> v <*> many_v
   |                  ^^^

Parsing.hs:82:21: error:
    • No instance for (Alternative f) arising from a use of ‘<|>’
      Possible fix:
        add (Alternative f) to the context of
          the type signature for:
            many :: forall (f :: * -> *) a. f a -> f [a]
    • In the expression: some_v <|> pure []
      In an equation for ‘many_v’: many_v = some_v <|> pure []
      In an equation for ‘many’:
          many v
            = many_v
            where
                many_v = some_v <|> pure []
                some_v = (:) <$> v <*> many_v
   |
82 |     many_v = some_v <|> pure []
   |                     ^^^

Parsing.hs:82:25: error:
    • No instance for (Applicative f) arising from a use of ‘pure’
      Possible fix:
        add (Applicative f) to the context of
          the type signature for:
            many :: forall (f :: * -> *) a. f a -> f [a]
    • In the second argument of ‘(<|>)’, namely ‘pure []’
      In the expression: some_v <|> pure []
      In an equation for ‘many_v’: many_v = some_v <|> pure []
   |
82 |     many_v = some_v <|> pure []
   |                         ^^^^

Parsing.hs:83:18: error:
    • No instance for (Functor f) arising from a use of ‘<$>’
      Possible fix:
        add (Functor f) to the context of
          the type signature for:
            many :: forall (f :: * -> *) a. f a -> f [a]
    • In the first argument of ‘(<*>)’, namely ‘(:) <$> v’
      In the expression: (:) <$> v <*> many_v
      In an equation for ‘some_v’: some_v = (:) <$> v <*> many_v
   |
83 |     some_v = (:) <$> v <*> many_v
   |                  ^^^
Failed, no modules loaded.

I think most of the errors are to do with <|>, I also imported Data.Functor to use <$> as that is where it is defined(?). I have a feeling I need to create a Class or Instance in order to fix the errors, but that’s as far as I have got. In fact I think I need to create a Class/Instance for Applicative of some sort.

I couldn’t get it to compile, either way.

See the “possible fix” suggested by the error message:

E.g.

"
Possible fix:
add (Alternative f) to the context of
the type signature for:
some :: forall (f :: * → *) a. f a → f [a]
"

What that means is take the line:

many :: f a -> f [a]

and change it to:

many :: Alternative f => f a -> f [a]

The same for some.

Let me know if that helps.

(By the way, I think this is a great example of where an error message has the right information, but could be worded more clearly. If it said: "have you tried changing the type signature blah to baz?, it would have been clear what to do)

ps: here’s a quick example of how to use megaparsec, that I think will actually run (hopefully!) :slight_smile: Parsing - The Haskell Guide

1 Like

Yeah, I’ll try this out later and hopefully it’ll work. I probably should have just removed the type signatures to begin with.

I think it was the ‘forall’ that made things extremely unclear.

The error messages are very unclear and difficult to understand.

1 Like

I think a lot of people share your feeling that improving the clarity of error messages would be a good thing, but it’s quite hard to know exactly what to change. For example, how would you want the errors above to be phrased?

I think I’d want something like:

There's a problem with line 83, column 18:

You used <|>, but this requires that the type of `some_v`, 
which is currently `f a`, to have an Alternative constraint on f.

Try adding this information to the type, as in:

...

But even here, I struggle to make it totally clear. I’d definitely be interested in your thoughts on the best way to phrase and present the error.

Btw, there’s a really nice on-going project to catalogue errors, and I think your error is No instance arising [GHC-39999] — Haskell Error Index. I imagine this will lead in the medium term to editors being able to provide useful information about errors, via this index?

1 Like

For one, I’d say to always quote user written code directly instead of adding forall (f :: * -> *) a. to the type. And show the line number. So:

Parsing.hs:75:21: error:
    • No instance for (Alternative f) arising from a use of ‘<|>’
      Possible fix:
        add (Alternative f) to the context of the type signature:
          74 |  some :: f a -> f [a]

(I’m making 74 up)

Of course it is not obvious how to implement this because the type signature is not always so plainly written. But perhaps this is a good special case to implement, because it probably covers 99% of real world cases.

By the way, I think Issues · haskell/error-messages · GitHub is a good place to report issues with error messages like this.

Adding Alternative f => f a -> f [a] to both sum and many fixed the issues. Thanks. I also think I’m starting to understand Haskell error messages slightly better now too, so that’s good news.

2 Likes

Yeah, it’s definitely a big topic in the Haskell community. It’s a difficult one that should probably take influence from some other languages, but the type system being the way it is makes this difficult.

It will be improved in time, it’s just like people having problems with GHC.Stack.Types.HasCallStack (which I actually don’t think is that bad an implementation), it’s just difficult to use effectively.

There was a decent discussion around GHC.Stack.Types.HasCallStack over here, if you’re unfamiliar.