Trying to understand the IO ()

Hi all,

I am slowing advancing with my Haskell skills, and I was able to the read yaml files and extract information from them according to a given custom data type. However, when I try to more some stuff inside a main = do to a function to make more compact the code there are some problems that indicated that probably I don’t understand the IO ().

This piece of code is working

{-# LANGUAGE DeriveGeneric #-}

import GHC.Generics
import Data.Yaml
import qualified Data.ByteString.Lazy as BSL
import GHC.Exception.Type
import Data.Typeable

import qualified Data.Text as DT

data Dataset = Dataset {
                    dataName     :: DT.Text
                  , dataKind     :: DT.Text
                  } deriving (Show, Generic)

instance FromJSON Dataset

main = do
  let fp = "test_single.yaml"
  ymlData <- BSL.readFile fp
  let mydata = decodeThrow (BSL.toStrict ymlData) :: Either SomeException [Dataset]
  result <-
    case mydata of
        Left exc -> error $ "Could not parse: " ++ show exc
        Right mydata -> return mydata
  print result

However, when I try to create a function like:

parseDataset mydataset = case mydataset of
                            Left exc -> error $ "Could not parse: " ++ show exc
                            Right mydataset -> return mydataset

readDatasets fname = parseDataset $ decodeThrow (BSL.toStrict yamldata) :: Either SomeException [Dataset]
            where yamldata =  BSL.readFile fname

main = do
  print $ readDatasets "test_single.yaml"

Then it’s not working.

Well, I tried <$> or fmap but it seems that I don’t know how to implement them when I have to indicate the type with :: Either SomeException [Dataset]

Probably, it is a lack of understanding of the IO () from my side. So I appreciate a lot any help with this piece of code, or any reference to read about.

Thanks!!

Note the feedback of the error is

    • Couldn't match expected type ‘BSL.ByteString’
                  with actual type ‘IO BSL.ByteString’
    • In the first argument of ‘BSL.toStrict’, namely ‘yamldata’
      In the first argument of ‘decodeThrow’, namely
        ‘(BSL.toStrict yamldata)’
      In the second argument of ‘($)’, namely
        ‘decodeThrow (BSL.toStrict yamldata)’

You can imagine IO is a container with some data. You can’t pass IO data to functions that expect the data without this “container”. But you can map over IO.

So yamldata returns IO ByteString, but BSL.toStrict expects BSL.ByteString. But you can map

BSL.toStrict <$> yamldata

But don’t forget that <$> doesn’t remove IO, it applies some function to the data inside IO. So you’ll get IO ByteString here. The next function, decodeThrow, expects a ByteString, but we have IO ByteString. And yes, we can map:

decodeThrow <$> BSL.toStrict <$> yamldata

The result will be in the IO again, IO (Either SomeException [Dataset]).
Your parseDataset has the signature Either SomeException a -> Either SomeException a. So it expects again the data without the IO. So we map

parseDataset <$> decodeThrow <$> BSL.toStrict <$> yamldata

And this is your function:

readDatasets :: String -> IO (Either SomeException [Dataset])
readDatasets fname = parseDataset <$> decodeThrow <$> BSL.toStrict <$> yamldata
            where yamldata =  BSL.readFile fname

If you run hlint on it, you’ll get a suggestion:

readDatasets :: String -> IO (Either SomeException [Dataset])
readDatasets fname = parseDataset . decodeThrow . BSL.toStrict <$> yamldata
            where yamldata =  BSL.readFile fname

hlint suggests to “combine” (or compose) the functions first (with .) and then map.

Note that you can use the do-notation in all monad-based functions (and IO is a monad), not only in main. So there is the third version:

readDatasets :: String -> IO (Either SomeException [Dataset])
readDatasets fname = do
    yamldata <- BSL.readFile fname
    return $ parseDataset $ decodeThrow $ BSL.toStrict yamldata

And the main function doesn’t have to be in do-notation. Since readDatasets returns IO, you have to map print over it:

main = print <$> readDatasets "test_single.yaml"
3 Likes

Very nice explanation!

For <$> do I have to import any specific library/module?

Thanks!!

No, it is a synonym for fmap from Prelude: https://hackage.haskell.org/package/base-4.12.0.0/docs/Prelude.html#v:-60--36--62-.

Hi Belka,

Just a quick question, the function

parseDataset mydataset = case mydataset of
                            Left exc -> error $ "Could not parse: " ++ show exc
                            Right mydataset -> return mydataset

was Either SomeException [Dataset] -> [Dataset]

but now it seems that it is not returning [Dataset] or IO [Dataset] but IO (Either SomeException [Dataset]). So it is possible that I have also to add new changes at parseDataset? As I need the type [Dataset] or IO [Dataset] to be processed by other functions.

Any idea?

Again many thanks.

Hi,

This function doesn’t do any IO, so it can just return [Dataset].
To get just [Dataset], remove return: Right mydataset -> mydataset.

return mydataset says: Return mydataset in some “container”, in some monad. How does the compiler know what monad to use? I defined the return type of readDatasets as IO (Either SomeException [Dataset]). We have IO already, so parseDataset has to return Either SomeException [Dataset]. The compiler determines that the monad should be Either. Therefore return mydataset is the same as Right dataset in that case.

After removing return you have to adjust the signature of readDatasets , it returns IO [Dataset].

2 Likes

First thank you for the help and your patience :grinning:
It seems it was hard to understand for me, but this code is now working, as @belka said (at the beginning I did not read correctly).

readDatasets :: String -> IO [Dataset]
readDatasets fname = parseDataset. decodeThrow . BSL.toStrict <$> yamldata
            where yamldata =  BSL.readFile fname

parseDataset :: Either SomeException [Dataset]) -> [Dataset]
parseDataset x = case x of
                    Left exc -> error $ "Could not parse: " ++ show exc
                    Right  x -> x

main = do
    check <- readDatasets "test_single.yaml"
    putStrLn ("type of check is: " ++ (show (typeOf check)))

It works. Thanks.

1 Like

This is hard stuff. Two years ago I spent several hours to write 3 lines invoking IO computations.

2 Likes