Need suggestions on code quality and best practices

Hello everyone. I am a JavaScript developer and very much a newbie at Haskell. Although I kind of understand and can write basic Haskell code, concepts like “monads” and some other things just go over my head.

I am currently trying out “Servant” to learn it and am writing a hobby web service with it. Initially I was a bit scared of trying to understand it from the documentation, but I have gained confidence and can now easily write basic CRUD apps with it. While working on this project I needed a Faker library. Though I found one, I decided to create one myself just to learn the ecosystem of modules and libraries in Haskell.

I have started to code one here → https://github.com/rabbihossain/nokol

Since I am very new to the Haskell ecosystem, I am wondering whether my approach is right. Could anyone please have a look at my source and suggest improvements?

I think your idea of creating that library is pretty cool!

I’ve read through it all and have some comments:

First, the function signatures are hard to understand: reading the types together with the function name does not tell me what a function does.
What is this supposed to do? Why do I pass three Strings to get a random string in IO?

getRandomString :: String -> String -> String -> IO String

And more examples like this…

getRandomIntss :: String -> String -> String -> Int -> IO [Int]
...

Sometimes, defining a type synonym can help make things clearer.
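As a rough sketch of that idea, borrowing the names Lang, Directory and Keyword that come up later in this thread: what the three String arguments actually mean is an assumption here, and describeLookup and LangCode are hypothetical stand-ins, not functions from the library.

```haskell
-- Type synonyms: purely documentation, still plain Strings underneath.
type Lang      = String
type Directory = String
type Keyword   = String

-- With synonyms, a signature such as
--   getRandomString :: Lang -> Directory -> Keyword -> IO String
-- says what each argument is for.

-- Hypothetical stand-in so the sketch compiles; it does no real lookup.
describeLookup :: Lang -> Directory -> Keyword -> String
describeLookup lang dir key = lang ++ "/" ++ dir ++ "/" ++ key

-- Synonyms are interchangeable with String, so the compiler will not
-- catch swapped arguments. A newtype is a distinct type and will.
newtype LangCode = LangCode String
  deriving (Eq, Show)
```

The synonym version costs nothing to adopt; the newtype version is stricter but needs wrapping and unwrapping at the call sites.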

Second, you rely a lot on Strings and Ints, but algebraic data types are amazing, and perhaps you could make more use of them.
The People module is a prime example of where a data type definition would help:

data Person = Person String String Int

-- In your current model of "passing a lang to every function and return the value in IO"
getManyPersons :: String -> IO [Person]
getManyPersons = ...

-- And then you could, e.g. define all other functions through that one
getManyFirstNames :: String -> IO [String]
getManyFirstNames lang = do
    people <- getManyPersons lang
    return $ map (\(Person name _ _) -> name) people

And lastly, and probably most importantly, all functions are unnecessarily in IO because you read data from a file every time a function is called. This also requires all functions to have parameters like the language. This could be abstracted away, and passed around implicitly. Think of a Faker Monad :stuck_out_tongue: that could be used without IO (and just require IO to run it) – i.e., you read the data for a language once, and then pass it around – all faker calls can access the already loaded data for the chosen language. (Passing around a constant state is a common idiom, see the Reader Monad)

I’ll write what things could look like if you abstracted some concepts through a Monad instance:

fakeFirstAndLastNames :: Faker [(String, String)]
fakeFirstAndLastNames = do
    firstNames <- getManyFirstNames 10
    lastNames <- getManyLastNames 10
    return $ zip firstNames lastNames

main :: IO ()
main = do
    names <- runFaker "en_us" fakeFirstAndLastNames
    forM_ names print  -- forM_ (from Control.Monad) takes the collection first, then the action

The implementation would have to be a part 2 of this answer, since it’s already too long; let us know … :slight_smile:
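To give a flavour of the idea in the meantime, here is a minimal sketch using only base. Everything in it is an assumption made for illustration: the FakeData record, the pick helper, the toy seed stepping and runFakerPure are hypothetical, and a real library would more likely build on the Reader monad plus the random package.

```haskell
import Control.Monad (replicateM)

-- Hypothetical data for one language, loaded once up front.
data FakeData = FakeData
  { fdFirstNames :: [String]
  , fdLastNames  :: [String]
  }

-- A Faker computation reads the loaded data (like Reader)
-- and threads a seed through (like State). No IO involved.
newtype Faker a = Faker { unFaker :: FakeData -> Int -> (a, Int) }

instance Functor Faker where
  fmap f (Faker g) = Faker $ \env s ->
    let (a, s') = g env s in (f a, s')

instance Applicative Faker where
  pure a = Faker $ \_ s -> (a, s)
  Faker f <*> Faker g = Faker $ \env s ->
    let (h, s')  = f env s
        (a, s'') = g env s'
    in (h a, s'')

instance Monad Faker where
  Faker g >>= k = Faker $ \env s ->
    let (a, s') = g env s
    in unFaker (k a) env s'

-- Toy linear congruential step, only to keep the sketch dependency-free.
nextSeed :: Int -> Int
nextSeed s = (s * 1103515245 + 12345) `mod` 2147483648

-- Pick one element from a field of the loaded data.
-- Assumes the list is non-empty.
pick :: (FakeData -> [a]) -> Faker a
pick field = Faker $ \env s ->
  let xs = field env
      s' = nextSeed s
  in (xs !! ((s' `div` 65536) `mod` length xs), s')

getManyFirstNames :: Int -> Faker [String]
getManyFirstNames n = replicateM n (pick fdFirstNames)

getManyLastNames :: Int -> Faker [String]
getManyLastNames n = replicateM n (pick fdLastNames)

fakeFirstAndLastNames :: Faker [(String, String)]
fakeFirstAndLastNames = do
  firsts <- getManyFirstNames 10
  lasts  <- getManyLastNames 10
  return (zip firsts lasts)

-- Pure runner: given the data and a seed, produce the result.
runFakerPure :: FakeData -> Int -> Faker a -> a
runFakerPure env seed (Faker g) = fst (g env seed)
```

An IO-based runFaker would then only need to load the FakeData for the requested language and choose a seed before handing over to the pure runner, so the file is read once per run instead of once per call.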


It’s quite common to use QuickCheck or hedgehog for writing tests which generate random samples of the test space. The advantage is that both can shrink examples and present the smallest counterexamples. In QuickCheck this is done explicitly through the implementation of a shrink :: a -> [a] function, while in hedgehog it’s automatic (which has its pros and cons).

There are libraries which generate various data types for QuickCheck, e.g. quickcheck-instances. Providing more newtype wrappers which can generate and shrink various data types would be an interesting addition. Note that shrinking usually must preserve some invariants, e.g. a shrunk email address should still be a valid email address. And this could be tested by QuickCheck itself, so you’d learn something very useful.
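A small sketch of an invariant-preserving shrinker, written against base only so it stands alone. The Email newtype, the deliberately simplistic isValidEmail check and the dropOne helper are all hypothetical; in QuickCheck, shrinkEmail would be the shrink method of an Arbitrary Email instance.

```haskell
-- Hypothetical wrapper for generated email addresses.
newtype Email = Email String
  deriving (Eq, Show)

-- Deliberately simplistic validity check (an assumption, not a real
-- email grammar): exactly one '@' with a non-empty part on each side.
isValidEmail :: Email -> Bool
isValidEmail (Email s) =
  case break (== '@') s of
    (local, '@' : domain) ->
      not (null local) && not (null domain) && '@' `notElem` domain
    _ -> False

-- All strings obtained by deleting exactly one character.
dropOne :: String -> [String]
dropOne xs = [ take i xs ++ drop (i + 1) xs | i <- [0 .. length xs - 1] ]

-- Shrink candidates: strictly shorter addresses that are still valid.
-- With QuickCheck this would read:
--   instance Arbitrary Email where
--     shrink = shrinkEmail
shrinkEmail :: Email -> [Email]
shrinkEmail (Email s) =
  [ e | c <- dropOne s, let e = Email c, isValidEmail e ]
```

The property worth testing is then that every element of shrinkEmail e satisfies isValidEmail whenever e does.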


I am also thinking of using this for function parameters. For example:

getRandomString :: Lang -> Directory -> Keyword -> IO String

As for returning data, I hadn’t really thought of returning a value of a dedicated type (like your idea of returning a Person). I thought returning a plain String or Int would be good enough, but now it makes sense to me from your example.

I didn’t know of any good approach to avoid the IO monad, although I had been thinking about it too. This is actually the main reason I shared my code here. Since I don’t really know the Reader monad, I will dig into it.

I don’t know if I understood your comment correctly. What I need is to generate a massive amount of realistic fake data on various topics, based on a user-defined schema. Since I need to write a lot of functions to do that, I thought of not including them directly in the project and instead having a separate library for those functions. I will still look into the QuickCheck library, since I need to write test cases for my code. I will also see if I can borrow any ideas from it, as you mentioned.

Thanks for your suggestions :smiley:
