I’m writing an HTTP parser using Parsec, and this is my current definition for parsing the HTTP status line:

    statusParser :: Parsec ByteString a HTTPStatus
    statusParser = do
      string "HTTP/"
      version <- choice [string "1.0", string "1.1", string "2.0"]
        <&> convert
        >>= onNothing (fail "Unknown HTTP version number, expected 1.0 1.1 2.0")
      space
      code <- sequence [digit, digit, digit]
        <&> convert
        >>= onNothing (fail "Invalid status code, expected 3 digits")
      space
      message <- if version == 2.0
        then pure Nothing
        else manyTill (choice [upper, space]) (Text.Parsec.Prim.try crlf)
          <&> Just
      pure $ HTTPStatus { .. }

My problem is that, whether I use `choice` or `<|>` directly, `version` only works for `string "1.0"` and `string "2.0"`, but not `"1.1"`. I assume it’s because of their overlapping prefix, as the first parser that starts matching something is the one that is allowed to “run all the way”.
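
To make that assumption concrete, here is a minimal sketch that isolates the version parser. It assumes a plain `String` stream (via `Text.Parsec.String`) instead of `ByteString`, and the names `versionNoBacktrack` and `versionWithTry` are hypothetical. The second parser wraps the overlapping alternatives in `try`, which is the variation I would expect to behave differently if the prefix explanation is right.

```haskell
import Text.Parsec (choice, parse, string, try)
import Text.Parsec.String (Parser)

-- Same shape as in the question: on the input "1.1", `string "1.0"`
-- matches "1." and then fails, having already consumed input, so
-- `choice` does not go on to try `string "1.1"`.
versionNoBacktrack :: Parser String
versionNoBacktrack = choice [string "1.0", string "1.1", string "2.0"]

-- `try` makes a failing alternative behave as if it consumed nothing,
-- so the next alternative starts again from the same position.
versionWithTry :: Parser String
versionWithTry = choice [try (string "1.0"), try (string "1.1"), string "2.0"]
```

With these definitions, `parse versionNoBacktrack "" "1.1"` yields a `Left`, while `parse versionWithTry "" "1.1"` yields `Right "1.1"`. Both accept `"2.0"`, since `string "1.0"` fails on the first character without consuming anything.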
I can think of workarounds, such as moving that parser out, expecting 3 characters (`digit`, `char '.'`, `digit`) and then failing if the result isn’t one of the choices. However, I find the current format more straightforward to write and parse mentally. Is there a minimally intrusive way to make this type of pattern work?
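
For reference, the workaround described above could look roughly like the following. This is only a sketch under assumptions: it uses a `String` stream rather than `ByteString`, returns the raw version string instead of a converted number, and `versionByShape` is a hypothetical name.

```haskell
import Text.Parsec (char, digit, parse)
import Text.Parsec.String (Parser)

-- Read "<digit>.<digit>" unconditionally, then check the result
-- against the list of supported versions.
versionByShape :: Parser String
versionByShape = do
  major <- digit
  _     <- char '.'
  minor <- digit
  let v = [major, '.', minor]
  if v `elem` ["1.0", "1.1", "2.0"]
    then pure v
    else fail "Unknown HTTP version number, expected 1.0 1.1 2.0"
```

This avoids the overlapping-prefix problem because no alternative is ever abandoned midway, but the list of accepted versions is now separated from the parsing step, which is the readability cost mentioned above.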
`convert` is just a custom function that allows me to unpack Text/ByteString to String and call `readMaybe` on that result. `onNothing` raises a failure on `Nothing` and unwraps the value from `Just` on success.
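
As a rough illustration of what these two helpers do, a simplified stand-in might look like this. The signatures are guesses reconstructed from how the helpers are used in `statusParser`, and this version of `convert` only handles `String`, not Text/ByteString.

```haskell
import Text.Read (readMaybe)

-- Simplified stand-in: the real helper also unpacks Text/ByteString first.
convert :: Read a => String -> Maybe a
convert = readMaybe

-- Run the fallback action on Nothing, otherwise return the wrapped value.
onNothing :: Applicative m => m a -> Maybe a -> m a
onNothing fallback = maybe fallback pure
```

With these definitions, `convert "1.1" :: Maybe Double` gives `Just 1.1`, and `onNothing (fail "...")` turns a `Nothing` into a parser failure inside the `Parsec` monad.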