If, say, I define data MyType = Cons1 String | Cons2 String Int
, how is that different from defining data MyType = Cons1 String
and then defining a function Cons2 :: String -> Int -> MyType
? I understand that when declaring a constructor, you don’t actually specify what it “does”, but I’m having a hard time understanding the precise difference.
Your signature for cons2 (String -> Int -> MyType) suggests that you can jump from Cons1 to Cons2 without losing any info, akin to, say:
data Colour = RGB Int Int Int
            | HSL Int Int Int -- Hue, Saturation, Luminosity
Is that correct?
In that case I would not use two constructors but just one, with helper functions to output specific representations (toRGB, toHSL :: Colour -> (Int, Int, Int)).
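As a rough sketch of that single-constructor approach: the colour is stored once as RGB, and the other representation is computed on demand. The channel range (0..255) and the HSL units (hue in degrees, saturation and lightness in percent) are my assumptions, not something stated above, and the conversion is the usual textbook formula rather than anything specific to this thread.

```haskell
import Data.Fixed (mod')

-- One constructor: the value is stored once, as RGB channels (assumed 0..255).
data Colour = RGB Int Int Int
  deriving (Show, Eq)

-- Returning the stored representation is trivial.
toRGB :: Colour -> (Int, Int, Int)
toRGB (RGB r g b) = (r, g, b)

-- Derive HSL on demand.
-- Assumed units: hue in degrees, saturation and lightness in percent.
toHSL :: Colour -> (Int, Int, Int)
toHSL (RGB r g b) = (round h, round (s * 100), round (l * 100))
  where
    norm :: Int -> Double
    norm c = fromIntegral c / 255
    r' = norm r
    g' = norm g
    b' = norm b
    mx = maximum [r', g', b']
    mn = minimum [r', g', b']
    d  = mx - mn
    l  = (mx + mn) / 2
    s | d == 0    = 0
      | otherwise = d / (1 - abs (2 * l - 1))
    h | d == 0    = 0
      | mx == r'  = 60 * (((g' - b') / d) `mod'` 6)
      | mx == g'  = 60 * ((b' - r') / d + 2)
      | otherwise = 60 * ((r' - g') / d + 4)
```

Both helpers read the same stored data, so no constructor has to be chosen per representation.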
Compare with another type like
data ChatMessage = IOT
                 | Handshake Int
                 | DirectMessage String
                 | ClientEnquiry Specs
                 ⁝
where constructors carry different information, not just a different representation of the same data.
You’re correct that when you define data MyType = Cons1 String | Cons2 String Int
, it does indeed create a constructor function that has the signature Cons2 :: String -> Int -> MyType
(this can be verified in GHCi:
> data MyType = Cons1 String | Cons2 String Int
> :t Cons2
Cons2 :: String -> Int -> MyType
). But in order for MyType
to behave as a data structure, you have to be able to “get the data back out” that you put into it. Cons2
as a data constructor has an additional property that a plain Cons2
function would not, which is that you can pattern match on it. Specifically, you can pattern match on a given MyType
, and if that MyType
was a Cons2
, your pattern can extract both the original String
and Int
that you put into it. For example:
printMyType :: MyType -> String
printMyType (Cons1 str) = str
printMyType (Cons2 str i) = str ++ show i
Now if I say printMyType (Cons2 "saturn" 5)
, I’ll get back "saturn5"
because I was able to use both the String
data and the Int
data that I put into it. Note that my definition of printMyType
would not have compiled if MyType
were set up like so:
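data MyType = Cons1 String
-- An ordinary function must have a lowercase name; capitalized names are reserved for constructors.
cons2 :: String -> Int -> MyType
cons2 str i = ...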
If I had used this setup, printMyType
would only be allowed to have a branch for Cons1
, and could only ever extract a single String
value from a MyType
. Cons2
wouldn’t be a valid case. Therefore any information about the Int
that you “constructed” your MyType
with would necessarily be lost.
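To make that loss concrete, here is a minimal sketch of the function-only setup. The body of cons2 is purely my assumption (the original leaves it elided); any body has to build its result out of Cons1, because that is the only constructor available.

```haskell
-- Only one real constructor exists in this setup.
data MyType = Cons1 String

-- An ordinary function, not a constructor (hence the lowercase name).
-- Hypothetical body: fold the Int into the String, since Cons1 has
-- nowhere else to keep it.
cons2 :: String -> Int -> MyType
cons2 str i = Cons1 (str ++ show i)

-- The only pattern available is Cons1; a Cons2 branch would not compile.
printMyType :: MyType -> String
printMyType (Cons1 str) = str
```

With this definition, cons2 "saturn" 5 and Cons1 "saturn5" are the same value, so no function can tell them apart or recover the Int as a separate field.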