Understanding `mempty`

I’m doing some exercises on writing instance of Semigroup & Monoid and finding it hard to relate to mempty.

It looks to me that there is no default implementation of mempty. Is that the case?

However, as I look at Monoid instances in GHC (at least the ones I can somewhat relate to) I’m often left wondering what mempty really is.

I totally get this

instance Monoid [a] where
  mempty = []
  mconcat xss = [x | xs <- xss, x <- xs]

The answer seems to be that mempty is what it is defined as in the instance. But then I saw this and was left wondering:

instance Monoid b => Monoid (a -> b) where
  mempty _ = mempty
  mconcat = \fs x -> mconcat $ map (\f -> f x) fs

Trying to guess what that means … For any function (a -> b) where a is mempty the function is mempty.

3 Likes

The Monoid instance for function is telling you:
a -> b can be made instance of Monoid as long as the result type b has an instance of Monoid. This instance then gives you mempty :: a -> b which is a function that works like this: it ignores its argument and returns mempty :: b

Few concrete examples in GHCi

λ> import Data.Monoid
λ> (mempty :: Int -> [Bool]) 123
[] -- In this case mempty refers to a function that ignores its Int argument and returns mempty :: [Bool] which is empty list
λ> (mempty :: String -> Any) "abc"
Any {getAny = False} -- The function ignores its argument and returns mempty :: Any
-- A more complex example showing chaining of multiple Monoid instances
-- Monoid ()
-- Monoid () => Monoid (IO ())
-- Monoid (IO ()) => Monoid (Bool -> IO ())
-- Monoid (Bool -> IO ()) => Monoid (Int -> Bool -> IO ())
-- Monoid (Int -> Bool -> IO ()) => Monoid (String -> Int -> Bool -> IO ())
λ> (mempty :: String -> Int -> Bool -> IO ()) "hi" 1 True
λ> -- ^ This function is basically equivalent to (\str int bool -> return ())
     -- this stuff can sometimes save you from typing few characters :-)
1 Like

So you’re correct in thinking that mempty is whatever it is defined to be in the instance (which of course must be of the appropriate type).

What you’re seeing in the instance for Monoid (a -> b) is not an exception. Let me explain what’s happening.

To define mempty for Monoid (a -> b), we must provide a value of type a -> b (ie a function from a to b), just as for the monoid for [a], we must provide a value of type [a]. (Let me know if that makes sense).

So, writing out the definition, we have:

instance Monoid b => Monoid (a -> b) where
  mempty = memptyForFunctions
      where 
        memptyForFunctions :: Monoid b => a -> b
        memptyForFunctions x = undefined
 

  mconcat = ...


where I’m using undefined as a placeholder for the eventual code. What type does the value replacing undefined have to have? Since memptyForFunctions has type a -> b, undefined needs to have type b.

Fortunately, we have a mempty instance for b, so we can use that here.

tldr: the mempty on the left hand side of the equation is using the a->b instance, but the one on the right is using the b instance.

2 Likes

I don’t really get that yet :(. As I think through you post I’m getting more questions.

1. What is the _ holding a place for? The function?

The monoid for [a] has

mempty = []

It doesn’t have the _ (whatever placeholder) like monoid for (a -> b).

mempty _ = mempty

Perhaps like:

> mempty (+1)
()

2. Since the monoid for [a] is [] why does this return ()?

> mempty [3]
()

Oh, I see that

> mempty [3] == []
True

So not sure what is happening there.

3. If mempty _ = mempty doesn’t there need to be a definition for mempty on the RHS
I realize you said it is using b but I didn’t understand that.

I don’t really understand “unit” (). Does my lack of clarity on that relate to the original question?

I think when you command mempty [3], GHCi provides some default types to make it work. So, I think that is the equivalent of commanding:

(mempty :: [Integer] -> ()) [3 :: Integer]

which is why it evaluates to ().

What is the _ holding a place for? The function?

Good question. Are you clear on what _ is holding a place for in the following?

foo :: Bool -> Int
foo _ = 5

(Answer, it’s a wildcard for the input variable, which is of type Bool. The presence of _ makes clear that the input is never used.

Since the monoid for [a] is [] why does this return ()?

Ah, I see your confusion. What type does the mempty instance have for [a]?

(Answer: it has type [a]).

Try entering mempty :: [Int]. You’ll get back []. Does this make sense to you? Note that mempty is not taking an argument. That’s important.

When you write mempty [3], Haskell sees that you have given mempty an argument, so it infers that the instance of Monoid you must mean is the one for functions. In particular, it assumes it’s the Monoid instance for () -> [Int]. That’s why it gives you ().

When you write mempty [3] == [], it infers it’s the instance for [a] -> [Int], which is why it computes [] and returns True.

I don’t really understand “unit” (). Does my lack of clarity on that relate to the original question?

It doesn’t, but I highly recommend trying to understand it. I was confused by similar things when learning Haskell and I deferred trying to understand them, and it confused me a lot.

Here’s how to understand (). It is a type, and there is a single value that has that type. That value is also written ().

For example try entering () :: () into ghci, or () == ().

If mempty _ = mempty doesn’t there need to be a definition for mempty on the RHS

Yes, I understand your confusion. But to emphasize: the mempty on the left and the mempty on the right are different functions. The one on the left belongs to the Monoid instance for a -> b, but the one on the right belongs to the Monoid instance for b.

Addendum:

More on (). It’s the type with one value inhabiting it. There’s also a type Void with no values inhabiting it. Either () a is isomorphic to Maybe a. Either Void a is isomorphic to a.

It turns out there’s a correspondence here:

() corresponds to 1
Void corresponds to 0
(a,b) corresponds to multiplication
Either a b corresponds to addition
a → b corresponds to exponentiation.

Happy to expands on this if it interests you - I think it’s the best way to understand types.

4 Likes

If you command :set -Wtype-defaults, GHCi will tell you what default types it is supplying when it does that. (Command :set -Wno-type-defaults to turn that back off.)

It is off the beaten track for your interest in mempty, but GHCi’s type defaulting is written up here: 3. Using GHCi — Glasgow Haskell Compiler 9.4.2 User's Guide.

I got to admit, that’s not intuitive.

It is not, but it’s consistent, in the sense that there isn’t another behaviour that would make sense. The fundamental reason it’s confusing is that Haskell infers the type, and the type (in this case) determines the value.

2 Likes

It’s worth asking: what would it mean to have a default version of mempty? What would it be?

That makes perfect sense now.

foo :: Bool -> Int
foo _ = 5
> foo "hello"
• Couldn't match type ‘[Char]’ with ‘Bool’

> foo True
5

Is there a way, other than understanding the types and looking a the documentation, to know which instance is being used? I.e., could I check if

numToString :: (Num a, Show a) => a -> String
numToString x = show x

> mempty numToString 5
()

is using the Monoid (a -> b) instance?

I’m not actually sure, but in general, looking at the types tells you unambiguously. For example, to work out what’s happening with foo = mempty 5, I can write foo = undefined 5, and mouseover undefined to have VSCode display the type that this thing must have.

If I replace undefined by mempty, and it compiles, then I know that mempty has this same type, and so must belong to the Integer -> () Monoid instance.

By the way, when I write foo = mempty 5 in VSCode without a type, a get:

This error is saying: "I don’t know what type mempty has, so I can’t work out the value.

By the way, this is probably the most useful advice I could give about learning Haskell: whenever you’re confused by what type an expression has in a program, replace it with undefined, and mouseover in VSCode: the compiler will just tell you its type.

When actually programming in Haskell, I usually sprinkle undefined (or _) all over the place, make sure that the whole program compiles, and then slowly start filling in all the undefineds. This is easy, because Haskell tells me what type they need to be. Similarly, when debugging a program that has a type error, I just replace successively larger chunks of it with undefined until it compiles. I can then narrow down until I find the error.

You can do this without VSCode (or any HLS IDE), but it’s particularly effective with it, because it’s so quick to just mouseover and see the type.

OK, now I see there are are two mempty

Constraint says b must be a monoid so it has to have an mempty.

instance Monoid b => Monoid (a -> b) where
         --------                 -

And then there is the mempty that is being defined in the current instance

mempty _ = mempty
------

How Haskell knows the mempty on the RHS is from the b feels mysterious.


Addendum :: More on () - yes I am keenly interested in that. Please expand.

How about we add a type signature for clarity and rewrite to

instance Monoid b => Monoid (a -> b) where
    mempty :: a -> b
    mempty = \_ -> mempty

Now it is clear that the right hand side mempty matches b. Wouldn’t you agree?

4 Likes

It knows because it does type inference. That is to say, whatever is on the right hand side of the equation has to have type b (because the mempty on the left has type a -> b).

Addendum :: More on () - yes I am keenly interested in that. Please expand.

Sure! So in general, a useful way to think (informally) about a type is as a set of values. For example, Bool corresponds to the set {True, False}. When we say True :: Bool, that’s like saying True is an element of the set {True, False}, or more colloquially, that True lives in the space of Boolean values.

Similarly, [Int] is the “space” of all possible lists of integers, so to say [1] :: [Int] is to say that the list [1] lives in this space.

The same holds for functions. For example, Int -> Int is the space of all functions from the space of integers to the space of integers. So the statement (\x -> x*2) :: Int -> Int is saying that the function \x -> x*2 lives in the space of integers.

Some of the common basic sets and operations on sets carry over to types in an illuminating way.

In particular, there is a set that has no values, commonly called the empty set, or the null set. This corresponds to the Haskell type Void. Void genuinely has no values in Haskell (modulo normal caveats), so there is no value a :: Void (modulo normal caveats).

There is then a set with just one value, and this corresponds to the Haskell type (). The single value that lives in the () space is also named () (Haskell often does this “pun” where values and their types share a name, which can be quite confusing if you’re not used to it). So we can write () :: ().

Things get more interesting when we turn to operations on sets. The cartesian product takes two sets a and b, and returns a x b, the set of all pairs with the left element from a and the right element from b. In Haskell, there is a corresponding notion. Given two types a and b, we have a type (a, b). For a concrete example, take the types Int and Bool. We then have a type (Int, Bool) and indeed, we have e.g. (5, True) :: (Int, Bool). In other words, (Int, Bool) is the type of pairs of an Int on the left and a Bool on the right.

Similarly for sets, there is an operation called disjoint union, which takes two sets and gives you the set containing the elements from both. In Haskell the corresponding notion, for two types a and b is Either a b. So for example, Either Int Bool contains all the elements from Int and also all the elements from Bool. For instance Right True :: Either Int Bool, and Left 5 :: Either Int Bool.

Finally, there is the set of functions from a set a to a set b. Naturally in Haskell that corresponds to the type a -> b.

OK, now for the arithmetic part. Let’s count the number of elements in various types.

size(Void) : 0
size( () ) : 1
size( (a,b) ) = size(a) * size(b)
size( Either a b ) = size( a ) + size (b).
size ( a -> b ) = size ( a) ^ size ( b )

If two types have the same size, they are isomorphic (i.e. they are basically the same, because we can map back and forth between them without losing information) so the above is often useful. For example, size(Either () a ) = 1 + size(a) = size(Maybe a), so Either () a is isomorphic to Maybe a.

1 Like

– Better - thank you!

See what you are saying. Not sure yet how the arithmetic part will help me understand types other than seeing the number of possibilities.

I’m wondering why getAny is always false? Any is new to me.

Here there is mempty for “abc” which is [] and I guess that is False since it isn’t True

> (mempty :: String -> Any) "abc" 
Any {getAny = False} 

But I don’t think there is mempty for Bool

> (mempty :: Bool -> Any) True
Any {getAny = False}

… or for Int

> (mempty :: Int -> Any) 123
Any {getAny = False}

In your examples, you are giving mempty an argument, but I think this might be a confusion. If there were a mempty instance for Bool, it would have type Bool, not type Bool -> Any. Try typing:

mempty :: Bool

and you’ll get No instance for (Monoid Bool), which confirms your suspicion that Bool doesn’t have a Monoid instance.

Now try

import Data.Monoid

mempty :: Any

you’ll get Any {getAny = False}, because this is the definition of mempty supplied by the Monoid instance for the type Any.

tldr: for a Monoid instance of some type a, mempty is of type a, so in general it isn’t a function, unless a is a function type.