CIS 194 | Help on non-exhaustive pattern

Super new to Haskell and I’m loving the language.
I’m doing the homework exercises in CIS194, particularly, Homework 2 - Exercise 2,

Lecture 2 provides Log.hs and I am to write LogAnalysis.hs

Log.hs

-- CIS 194 Homework 2

module Log where

import Control.Applicative

data MessageType = Info
                 | Warning
                 | Error Int
  deriving (Show, Eq)

type TimeStamp = Int

data LogMessage = LogMessage MessageType TimeStamp String
                | Unknown String
  deriving (Show, Eq)

data MessageTree = Leaf
                 | Node MessageTree LogMessage MessageTree
  deriving (Show, Eq)

-- | @testParse p n f@ tests the log file parser @p@ by running it
--   on the first @n@ lines of file @f@.
testParse :: (String -> [LogMessage])
          -> Int
          -> FilePath
          -> IO [LogMessage]
testParse parse n file = take n . parse <$> readFile file

-- | @testWhatWentWrong p w f@ tests the log file parser @p@ and
--   warning message extractor @w@ by running them on the log file
--   @f@.
testWhatWentWrong :: (String -> [LogMessage])
                  -> ([LogMessage] -> [String])
                  -> FilePath
                  -> IO [String]
testWhatWentWrong parse whatWentWrong file
  = whatWentWrong . parse <$> readFile file

I already accomplished exercise 1

{-# OPTIONS_GHC -Wall #-}

module LogAnalysis where

import Log

-- Exercise 1
parseMessage :: String -> LogMessage
parseMessage strLog = case words strLog of
    ("I" : ts : logString) ->
      LogMessage Info (read ts) (unwords logString)
    ("W" : ts : logString) ->
      LogMessage Warning (read ts) (unwords logString)
    ("E" : etype : ts : logString) ->
      LogMessage (Error (read etype)) (read ts) (unwords logString)
    _ -> Unknown strLog

parse :: String -> [LogMessage]
parse strLogs = map parseMessage (lines strLogs)

and now for exercise 2, i am to insert LogMessage nodes to a MessageTree binary search tree

this is my implementation currently

-- Exercise 2
insert :: LogMessage -> MessageTree -> MessageTree
insert (Unknown _) msgTree = msgTree
insert logMsg@(LogMessage{}) Leaf = Node Leaf logMsg Leaf
insert 
  logMsg@(LogMessage _ ts _) 
  (Node leftTree nodeLogMsg@(LogMessage _ nodeTs _) rightTree)
    | ts < nodeTs = Node (insert logMsg leftTree) nodeLogMsg rightTree
    | otherwise = Node leftTree nodeLogMsg (insert logMsg rightTree)

To my problem, ghci is warning me on ‘incomplete-patterns’

LogAnalysis.hs:20:1: warning: [-Wincomplete-patterns]
    Pattern match(es) are non-exhaustive
    In an equation for ‘insert’:
        Patterns of type ‘LogMessage’, ‘MessageTree’ not matched:
            (LogMessage _ _ _) (Node _ (Unknown _) _)
   |
20 | insert (Unknown _) msgTree = msgTree
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^...

But from my understanding of my implementation, all LogMessages of Unknown types wouldn’t be inserted in a MessageTree in the first place. I understand that the compiler might not be able to infer this runtime behavior but is there a way to “tell” the compiler that I don’t have to match that pattern?

3 Likes

The most straightforward way would be to define a new data type for log messages that doesn’t have the Unknown constructor, for example:

data KnownLogMessage = KnownLogMessage MessageType TimeStamp String
  deriving (Show, Eq)

data MessageTree = Leaf
                 | Node MessageTree KnownLogMessage MessageTree

EDIT: corrected thanks to @jaror

3 Likes

If you are very confident that you are correct then you can just use error:

insert LogMessage{} (Node _ Unknown{} _) =
  error "impossible: unknown node in log messages"
2 Likes

While you can probably get away with error here, you don’t know that you’re the only one creating MessageTrees. Someone could construct a broken tree and crash your program. If I was doing this exercise “for real”, I’d use @tomjaguarpaw’s approach, and probably also change the LogMessage constructor to LogMessage KnownLogMessage.

Another alternative is to return Maybe MessageTree, and return Nothing when the tree contains unexpected nodes, but that a) pushes more responsibility downstream (though this is not too bad, after you pick up the tools provided by Functor/Applicative/Monad) and b) breaks the written constraints of the exercise.

1 Like

I’m surprised to find myself in disagreement with the other posters here! You can make insert a total function (no incomplete warnings, no use of error or undefined) without changing any types or changing its behavior on the subset of msgTrees that don’t contain Unknown:

insert (Unknown _) msgTree = msgTree
insert logMsg@(LogMessage{}) Leaf = Node Leaf logMsg Leaf
insert 
  logMsg@(LogMessage _ ts _) 
  (Node leftTree nodeLogMsg@(LogMessage _ nodeTs _) rightTree)
    | ts < nodeTs = Node (insert logMsg leftTree) nodeLogMsg rightTree
insert logMsg (Node leftTree nodeLogMsg rightTree) = 
  Node leftTree nodeLogMsg (insert logMsg rightTree)

A set of guards (| ts < nodeTs) doesn’t have to be exhaustive in Haskell, as long as the overall set of patterns is. If that guard is reached and ts is not less than nodeTs, the program will proceed to checking the next function clause.

4 Likes

Thank you everyone for the help! I learned a lot. I wouldn’t have thought to construct a new data type, or use error as a catch. I was thinking about using Maybe to unwrap the value though.

@rhendric’s solution seems to be the one I’m looking for. I just thought that guards needed the otherwise branch in order for the function to be total.

1 Like

Is this a typo? I think you meant:

data MessageTree = Leaf
                 | Node MessageTree KnownLogMessage MessageTree

Oh thanks, that’s what I should have written.

I have just added insert _ tree = tree at the end like this:

insert :: LogMessage -> MessageTree -> MessageTree
insert (Unknown _) tree = tree
insert message Leaf = Node Leaf message Leaf
insert message (Node left treeMessage right)
  | getTime message <= getTime treeMessage = Node (insert message left) treeMessage right
  | getTime message > getTime treeMessage = Node left treeMessage (insert message right)
insert _ tree = tree

This is good catch all metod.

If you evolve the data types, this function will silently ignore new constructors, which might not be what you want. I generally prefer (when it’s not too hard) to enumerate a set of patterns which cover all cases, so that when I change a type, the compiler reminds me to consider each use site.

1 Like