Parsing TeX output with Parsec

I must say first of all I’m quite new in Parsec. I am trying to write a parser in order to catch errors during the creation of a document via TeX. I have already written something using pcre-light, but I think it might be nice to learn some Parsec. The book “Real World Haskell” seems quite outdated regarding this topic. I read something on Hackage, and here is my attempt.

{- cabal:
    build-depends: base, parsec
-}

module TeXOutputsParser where

import Text.Parsec
import Text.Parsec.String

data TeXError = FileNotFound String
              | OtherTeXError String
  deriving (Show)

texError :: GenParser Char s TeXError
texError = do
  char '!' >> spaces
  try fileNotFound <|> otherTeXError <?> "tex-error-parser failed..."
  where
    fileNotFound :: GenParser Char s TeXError
    fileNotFound =
      FileNotFound <$>
        between (string "LaTeX Error: File `")
                (string "' not found")
                (many (noneOf "'"))
    otherTeXError :: GenParser Char s TeXError
    otherTeXError =
      OtherTeXError <$> many (noneOf "?")

texErrors :: GenParser Char s [TeXError]
--texErrors = many (noneOf "!") >> texError `sepBy` many (noneOf "!") 
texErrors = (eof >> return [])
        <|> (texError >>= \err -> (err:) <$> texErrors)
        <|> (many (noneOf "!") >> texErrors)

The question are:

  1. As you can see, I have commented one LOC, being that one of my early attempts. But it fails, see the output below for example. Why?
  2. Any suggestion? Can I make the code shorter and more “idomatic”?

Here is a sample output:

This is LuaHBTeX, Version 1.17.0 (TeX Live 2023) 
 restricted system commands enabled.
(./main.tex
LaTeX2e <2023-06-01> patch level 1
L3 programming layer <2023-06-16>
 (~/texlive/texmf-dist/tex/latex/base/article.cls
Document Class: article 2023/05/17 v1.4n Standard LaTeX document class
(~/texlive/texmf-dist/tex/latex/base/size11.clo))
(~/texlive/texmf-dist/tex/latex/fontspec/fontspec.sty
(~/texlive/texmf-dist/tex/latex/l3packages/xparse/xparse.sty
(~/texlive/texmf-dist/tex/latex/l3kernel/expl3.sty
(~/texlive/texmf-dist/tex/latex/l3backend/l3backend-luatex.def)))
(~/texlive/texmf-dist/tex/latex/fontspec/fontspec-luatex.sty
(~/texlive/texmf-dist/tex/latex/base/fontenc.sty)
(~/texlive/texmf-dist/tex/latex/fontspec/fontspec.cfg)))
(~/texlive/texmf-dist/tex/latex/kantlipsum/kantlipsum.sty)

! LaTeX Error: File `not-a-package.sty' not found.

Type X to quit or <RETURN> to proceed,
or enter new name. (Default extension: sty)

Enter file name: 
! LaTeX Error: File `not-a-package2.sty' not found.

Type X to quit or <RETURN> to proceed,
or enter new name. (Default extension: sty)

Enter file name: (./main.aux) (~/texlive/texmf-dist/tex/latex/base/ts1cmr.fd)
[1{~/texlive/texmf-var/fonts/map/pdftex/updmap/pdftex.map}]
(./main.aux))
 406 words of node memory still in use:
   3 hlist, 1 vlist, 1 rule, 2 glue, 3 kern, 1 glyph, 4 attribute, 48 glue_spec
, 4 attribute_list, 1 write nodes
   avail lists: 2:82,3:20,4:5,5:82,6:8,7:2054,8:1,9:72,10:3,11:144
<~/texlive/texmf-dist/fonts/opentype/public/lm/lmroman10-regular.otf
><~/texlive/texmf-dist/fonts/opentype/public/lm/lmroman12-regular.ot
f><~/texlive/texmf-dist/fonts/opentype/public/lm/lmroman17-regular.o
tf>
Output written on main.pdf (1 page, 12828 bytes).
Transcript written on main.log.

Thank you for any help.

Not entirely sure, but do you encode in your parse somewhere that the ! must be at the beginning of a line? If not, I think that would improve it.

The combinator sepBy expects its input to end with its first argument. In your case, every separator many (noneOf "!") would have to be followed by texError. If you replace sepBy with sepEndBy your one-liner works.

1 Like