If, say, I define data MyType = Cons1 String | Cons2 String Int, how is that different from defining data MyType = Cons1 String and then defining a function Cons2 :: String -> Int -> MyType? I understand that when declaring a constructor, you don’t actually specify what it “does”, but I’m having a hard time understanding the precise difference.
Your signature for Cons2 (String -> Int -> MyType) suggests that you can jump from Cons1 to Cons2 without losing any information, akin to, say:
data Colour = RGB Int Int Int
            | HSL Int Int Int -- Hue, Saturation, Luminosity
Is that correct?
In that case I would not use two constructors but just one, with helper functions to output specific representations (toRGB, toHSL :: Colour -> (Int, Int, Int)).
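As a rough sketch of that single-constructor design, assuming the colour is stored as 0-255 RGB channels and that toHSL reports hue in degrees with saturation and luminosity as percentages (both are assumptions, the original does not fix the units), the helpers could look like this:

```haskell
-- One constructor: the colour is stored once, as 0-255 RGB channels (assumed).
data Colour = RGB Int Int Int
  deriving (Show, Eq)

-- The stored representation is returned unchanged.
toRGB :: Colour -> (Int, Int, Int)
toRGB (RGB r g b) = (r, g, b)

-- Derived view: hue in degrees (0-359), saturation and luminosity in percent.
toHSL :: Colour -> (Int, Int, Int)
toHSL (RGB r g b) = (round h `mod` 360, round (s * 100), round (l * 100))
  where
    -- Scale a channel from 0-255 to the range 0-1.
    norm :: Int -> Double
    norm c = fromIntegral c / 255
    r' = norm r
    g' = norm g
    b' = norm b
    mx = maximum [r', g', b']
    mn = minimum [r', g', b']
    d  = mx - mn
    l  = (mx + mn) / 2
    s  | d == 0    = 0
       | otherwise = d / (1 - abs (2 * l - 1))
    h  | d == 0    = 0
       | mx == r'  = wrap (60 * ((g' - b') / d))
       | mx == g'  = 60 * ((b' - r') / d + 2)
       | otherwise = 60 * ((r' - g') / d + 4)
    -- Bring a negative hue back into the range 0-360.
    wrap x = if x < 0 then x + 360 else x
```

In GHCi, toHSL (RGB 255 0 0) evaluates to (0,100,50). Callers never need to know which representation is stored, since they only go through the helpers.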
Compare with another type like
data ChatMessage = IOT
                 | Handshake Int
                 | DirectMessage String
                 | ClientEnquiry Specs
                 ⁝
where constructors carry different information, not just a different representation of the same data.
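To make that concrete, here is a small sketch that pattern matches on the four constructors shown above. Specs is not defined in the original, so the newtype below is only a hypothetical placeholder, describe is an illustrative name, and the elided constructors are left out:

```haskell
-- Hypothetical payload type; the original does not say what Specs contains.
newtype Specs = Specs String

data ChatMessage = IOT
                 | Handshake Int
                 | DirectMessage String
                 | ClientEnquiry Specs

-- Each branch can only use the data that its own constructor carries.
describe :: ChatMessage -> String
describe IOT                       = "IOT (no payload)"
describe (Handshake n)             = "handshake carrying the Int " ++ show n
describe (DirectMessage txt)       = "direct message: " ++ txt
describe (ClientEnquiry (Specs s)) = "client enquiry about: " ++ s
```

There is no sensible way to turn a Handshake into a DirectMessage here, which is what separates this type from the Colour case.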
You’re correct that defining data MyType = Cons1 String | Cons2 String Int does indeed create a constructor function with the signature Cons2 :: String -> Int -> MyType. This can be verified in GHCi:
> data MyType = Cons1 String | Cons2 String Int
> :t Cons2
Cons2 :: String -> Int -> MyType
But in order for MyType to behave as a data structure, you have to be able to “get the data back out” that you put into it. Cons2 as a data constructor has an additional property that a plain function would not have: you can pattern match on it. Specifically, you can pattern match on a value of type MyType, and if that value was built with Cons2, your pattern can extract both the original String and the Int that you put into it. For example:
printMyType :: MyType -> String
printMyType (Cons1 str) = str
printMyType (Cons2 str i) = str ++ show i
Now if I evaluate printMyType (Cons2 "saturn" 5), I get back "saturn5", because I was able to use both the String and the Int that I put into it. Note that my definition of printMyType would not have compiled if MyType were set up like so:
data MyType = Cons1 String
Cons2 :: String -> Int -> MyType
Cons2 str i = ...
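With this setup, printMyType would only be allowed to have a branch for Cons1, and could only ever extract a single String value from a MyType. Cons2 would not be a valid pattern. (Strictly speaking, Haskell would not even accept Cons2 as the name of an ordinary function, since function names must begin with a lowercase letter; it would have to be called cons2.) Therefore any information about the Int that you “constructed” your MyType with would necessarily be lost.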
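A minimal sketch of the function-only variant shows the loss. The body of the function is elided in the original, so the one below (folding the Int into the String) is just one possible choice, and the function is spelled cons2, as an ordinary function name has to be:

```haskell
data MyType = Cons1 String
  deriving (Show, Eq)

-- Hypothetical body: the Int can only survive by being folded into the String.
cons2 :: String -> Int -> MyType
cons2 str i = Cons1 (str ++ show i)

-- Cons1 is the only pattern available, so only a String can be extracted.
printMyType :: MyType -> String
printMyType (Cons1 str) = str
```

With this definition, cons2 "saturn" 15 and cons2 "saturn1" 5 both produce Cons1 "saturn15", so the original String and Int can no longer be told apart once the value is built.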