Why doesn't UUID have a Bits instance?

It is just a 128 bit word after all.

I’ve done so many fun things by treating UUIDs as bits that it feels like the instance is obligatory.

It is just a 128 bit word after all.

It isn’t, UUIDs have a format and even the randomly-generated version 4 ones still use 4 bits to represent the version.

2 Likes

It’s like we should adopt BUID instead. All bits, no format.

newtype BUID = BUID UUID

instance Bits BUID where
-- etc

instance FiniteBits BUID where
-- etc
1 Like

Looking at the Data.Bits docs, I’m surprised tuples don’t have these instances.

…well then, why not something like:

newtype BinaryProgram = BinPrgm Integer

instance Bits BinaryProgram where
-- etc

instance FiniteBits BinaryProgram where
-- etc

(…and see how much fun can be had by treating our computer programs “as bits” !)


That UUIDs (or computer programs!) are encoded as large integral values is an “implementation artefact” e.g. if we were using Babbage difference engines, UUIDs and programs may very well be encoded as combinations of gears!

So, no: Bits instances for UUIDs should not be “obligatory”

You can solve multiple real world problems elegantly and efficiently by using UUID bits directly. A/B testing, sharding, certain types of idempotency keys are three ways I have this sort of thing running in prod, and I keep finding more uses. And each time I unpack the UUID to words and repack it back in.

I’ll just rework a UUID library of mine so that it solves something forall FiniteBits and provide the above newtype myself. Since it seems perfect lawful. Then people can benefit from the benefits of treating UUIDs as 128 bits without having to munge words around themselves like C programmers.

Are you sure you need all the Bits operations for this instead of like three specific ones? Separate functions can be as partial as you need them to be and you can document them nicely.

I need a few, true. But a glance at the classes make me think tuples of any arity have valid instances. Which in turn means UUID does.

…and there a probably a few other real world problems which could be solved by using binary-program bits directly - but we don’t.


[…] people can benefit from the benefits of treating UUIDs as 128 bits without having to munge words around themselves like C programmers.

Actually no, it would be worse - they’ll be munging words like assembly coders! Even C has the notion of structured values.

data UUIDInfo = UUIDInfo ...
                         ...
                         ...
                         ...

uuidInfo :: UUID -> UUIDInfo
uuidInfo uu = ...

bldUUID  :: UUIDInfo -> UUID
bldUUID uinf = ...

modifyUUID :: (UUIDInfo -> UUIDInfo) -> UUID -> UUID
modifyUUID f uu = bldUUID . f . uuidInfo

In this way there can be functions for manipulating the various components of UUIDInfo values as needed, thereby sparing people from the ugliness of flipping bits directly and correctly.


https://hackage.haskell.org/package/base/docs/Data-Bits.html#t:Bits

…since there are no actual Bits instances for tuples listed there, you should try to implement one yourself, rather that just assuming.

Are you suggesting using UUIDs this way isn’t reasonable or something? Because in distributed systems, the techniques I’m describing are quite effective. They’re actually humming along as we speak in the cloud :grin:

Should be pretty fun stuff! I’m envisioning property tests in my future.

Are you suggesting using UUIDs this way isn’t reasonable or something?

  • Abstraction - Cornell CS (2016)

  • Lecture 9: Data Abstraction (2016)

    (slide 6 of 17)

    Never violate the abstraction barrier!

Users of UUIDs should not have to know that they are implemented as bags of bits (or cogs)!

I don’t understand why this isn’t a general argument against almost all Bits instances. Bits Fd? A file descriptor just happens to be encoded in binary. Bits Int? We could have represented our integers in some other base, you know. Bits Bool? How foolish to assume that False and True mapping to 0 and 1 is anything other than an implementation detail.

That’s a confusing response, if it’s a response to me. Does the existence/possibility of ternary computation mean that you think the Bits class is fundamentally misguided and shouldn’t exist, or what?

It’s definitely a response to you:


I think that the use of the Bits class and its various instances should be kept private, behind appropriate domain-specific interfaces (hence the UUIDInfo suggestion). So if, at some point in the future, the use of trinary computation is more widespread then only those private details have to be modified, leaving the interfaces unchanged.

“Users” as in myself, the backend engineer working on some distributed system?

This just feels aesthetic to me. UUIDs are bits - it’s right there in the public interface.

Like…how do you expect me to do sharded range queries on UUID keys without using their bits?

So, you’d only use instances of a class—a language feature enabling ad-hoc polymorphism over multiple types—in functions that are specific to a single type? What’s the point of the class then?

I’m not arguing that a Bits UUID does or doesn’t make sense; I just can’t make sense of your argument against it, because UUIDs and Fds and Ints and Bools are all only ‘accidentally’ bits (which is to say, at the implementation level, they are definitely specified to be bits; but many programmers can operate at a level of abstraction where this is irrelevant). Your links about ternary computation seem to support this viewpoint; you agree that it’s accidental that we work with bits as opposed to trits, in general. So where’s the line between the accidents that we enshrine in Bits instances and the accidents that we don’t?

No, UUIDs are implemented as bits:


No, you’re the implementor (of your library). The users I was referring to were these:

…namely the users of that library: do they really have to know all the “gory details” of how UUIDs are implemented?


Well, I’m assuming that we don’t want:

instance Bits (a -> b)
instance Bits (IO a)
instance Bits (ST s a)

…those instances being on “the other side” of your designated “line”. But here’s another way to think about it - if our programs were running on Babbage difference-engines:

  • would we have a Cogs class?

  • if so, what types would we use it with?