Improving Machine Arithmetic

Every once in a while I set out to write some code that involves e.g. some nontrivial index arithmetic, and I find the available toolbox for machine arithmetic clearly lacking. To be specific, I don’t mean any advanced number theory or anything, but rather simple things like addition and multiplication, viewed through the prism of representing mathematical integers on a computer: dealing with overflow, rounding, and other possible limitations of said representation.

For example, when adding two Ints, it can happen that the result is not itself representable as an Int: there is an “overflow” or “underflow”. Being a haskeller, I care about program correctness, and how my program responds to overflow is an important aspect of it. In different circumstances I may intend it to respond differently; here are a few really common ways (sketched in code after the list):

  • Wrapping: perform the addition modulo 2^bitwidth (which in the case of Int being a signed type implies using 2’s complement);
  • Trapping: if the addition was unsuccessful, terminate the entire computation (error);
  • Checked: let me branch (pattern match) on whether the addition was successful or not, and potentially also on whether it was overflow or underflow, etc.;
  • Saturating: return the Int that is closest to the mathematical result, meaning maxBound in case of overflow and minBound in case of underflow;
  • Undefined: trust me, it’s not going to happen; generate the fastest implementation possible under this assumption.
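To make these concrete, here is a minimal sketch of what the first four behaviours could look like for Int addition, built on the addIntC# primop from GHC.Exts (the helper names are invented for illustration):

{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}

import GHC.Exts (Int (I#), addIntC#)

-- Wrapping: this is what (+) @Int already does.
wrappingAdd :: Int -> Int -> Int
wrappingAdd = (+)

-- Checked: branch on whether the addition overflowed.
checkedAdd :: Int -> Int -> Maybe Int
checkedAdd (I# x) (I# y) = case addIntC# x y of
  (# r, 0# #) -> Just (I# r)
  (# _, _  #) -> Nothing

-- Trapping: terminate the computation if the result is not representable.
trappingAdd :: Int -> Int -> Int
trappingAdd x y = case checkedAdd x y of
  Just r  -> r
  Nothing -> error "trappingAdd: Int overflow"

-- Saturating: clamp to maxBound (overflow) or minBound (underflow).
saturatingAdd :: Int -> Int -> Int
saturatingAdd x y = case checkedAdd x y of
  Just r              -> r
  Nothing | y > 0     -> maxBound
          | otherwise -> minBound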

The base library does not provide any way for me to specify that I want one of these things to happen. Worse yet, the functions that base provides aren’t even documented to belong to one of these categories. Most operations such as (+) @Int are wrapping, but this is not documented anywhere. fromInteger @Natural cannot be wrapping so it is trapping, and so is fromEnum @Natural. But fromEnum @Integer is wrapping.
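For example, on a 64-bit GHC a GHCi session shows something like this (exact output may vary by version):

ghci> import Numeric.Natural
ghci> maxBound + 1 :: Int
-9223372036854775808
ghci> fromInteger (-1) :: Natural
*** Exception: arithmetic underflow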

Still, the program needs to be written, and while one could make pedantically correct implementations of those behaviors to the tune of:

-- (MultiWayIf syntax; a, b :: Int)
let result = toInteger @Int a + toInteger @Int b
in if
  | result > toInteger @Int maxBound ->
      -- handle overflow
  | result < toInteger @Int minBound ->
      -- handle underflow
  | otherwise ->
      -- use `fromInteger @Int result`, which is now guaranteed to be safe

these would be embarrassingly slow. And besides program correctness, I care about program performance. So what often ends up happening (I’ve seen it in my code and in others’) is that you reach for GHC.Prim.addIntC# and reinvent the wheel, over and over again.

This situation is basically on par with C, where there’s only one +, it does something, and you always have to keep in mind what; and where the recommended trick (!) for detecting unsigned overflow is checking whether a + b < a.

In the long term I would like the situation in base to improve, but in the interim a package would be great too. Having a concrete and proven-usable interface is a great vehicle for getting changes merged into base.

One of the goals is uncompromising performance, and the implementations may be tricky to optimize, so it would be better to have a centralized place where these optimizations live, rather than everyone optimizing their own reinvented wheel. Notably, said optimizations will necessarily have to interact closely with compiler internals and may require new primops, so perhaps maintaining them inside the GHC source tree would be better? On the other hand, introducing a bunch of primops for which we don’t have a good interface may not be a good idea either.

My search on Hackage for prior art only found checked, which is likely hopelessly outdated. The closest in spirit is actually integer-logarithms, but that only provides logarithms and powers.

I would like to invite some discussion about this topic: what such a package could look like, what changes to base could look like, etc. Here are some of my initial thoughts:

  • Prior to talking about typeclasses or other overloaded interfaces, we should talk about the monomorphic functions we’re trying to support. Having to explicitly spell out checkedAddInt32ToInt32 is better than having nothing to call at all because you couldn’t fit this function into a typeclass that mandates that the type must be a -> a -> a. (A few such monomorphic candidates are sketched after this list.)
  • We may need more complex functions than the basic binary operators, but it’s difficult to define the exact scope.
    • Some operations can be composed out of more basic building blocks, while others cannot. For example, a saturating multiply-add is different from a saturating multiply followed by a saturating add, but in wrapping mode they are the same. For another example, unsigned multiply-then-modulo never overflows, but is different from a wrapping multiply followed by a modulo.
    • There’s always the reference implementation going through Integer. Some operations might admit much more performant implementations, while others may not. For example, unsigned multiply-then-modulo is about 2 instructions on x86_64.
    • Some operations have clear applications, while others do not.
  • Heterogeneous operations, such as adding a signed to an unsigned integer, absolute difference, and 64x64->128-bit multiplication, are also important.
  • It’s not clear what to do about portability, and in particular the Int size story. Depending on the platform, Int is equivalent to either Int32 or Int64. And then there are ancient comments in GHC.Prim about how an Int can be 30 bits due to tagging.
  • We should look to other standard libraries for inspiration:
    • Rust: by far the best example, u32 has methods add, strict_add, checked_add, carrying_add, wrapping_add, unchecked_add, saturating_add, overflowing_add. There’s also a lot of non-basic operations worth considering.
    • C++: has add_sat and ckd_add. Also midpoint is a non-basic operation worth considering.
    • Zig: see its documentation on Integer Overflow; there are the built-in operators +, +% and +|, plus @addWithOverflow and math.add.
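To illustrate the kind of monomorphic (and sometimes heterogeneous) operations I have in mind, here are a few invented names with naive Integer-based reference implementations, exactly the sort a real library would want to optimize:

import Data.Int  (Int32, Int64)
import Data.Word (Word)

-- Checked addition of two Int32s.
checkedAddInt32ToInt32 :: Int32 -> Int32 -> Maybe Int32
checkedAddInt32ToInt32 a b =
  let r = toInteger a + toInteger b
  in if r > toInteger (maxBound :: Int32) || r < toInteger (minBound :: Int32)
       then Nothing
       else Just (fromInteger r)

-- Saturating multiply-add: clamps the mathematical result a*b + c once,
-- which is not the same as a saturating multiply followed by a saturating add.
saturatingMulAddInt64 :: Int64 -> Int64 -> Int64 -> Int64
saturatingMulAddInt64 a b c =
  fromInteger . max (toInteger (minBound :: Int64))
              . min (toInteger (maxBound :: Int64))
              $ toInteger a * toInteger b + toInteger c

-- Heterogeneous: the absolute difference of two Ints always fits in a Word.
absDiffIntToWord :: Int -> Int -> Word
absDiffIntToWord a b = fromInteger (abs (toInteger a - toInteger b))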
16 Likes

For prior art there is int-cast: Checked conversions between integral types for safe casts. Also, bytestring goes to some lengths to catch overflows; grep for checkedAdd.

1 Like

Haskell packages can contain not only Haskell but also Cmm, which gets you very close to “your own primops” if you need them, without hacking on GHC itself.

4 Likes

This would extend to other integral types in Data.Int, right?

That’s interesting. Are you referring to this inline-cmm library? Or maybe separate .cmm source files?

Just .cmm files; they can be included under cmm-sources: in the Cabal file.
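For instance (purely hypothetical names), assuming a CheckedAdd.cmm file listed under cmm-sources: defines a procedure checked_add_int returning the sum together with an overflow flag, the Haskell side could look roughly like this:

{-# LANGUAGE GHCForeignImportPrim #-}
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}
{-# LANGUAGE UnliftedFFITypes #-}

import GHC.Exts (Int (I#), Int#)

-- "prim" imports must use unboxed argument and result types.
foreign import prim "checked_add_int"
  checkedAddInt# :: Int# -> Int# -> (# Int#, Int# #)

checkedAddInt :: Int -> Int -> Maybe Int
checkedAddInt (I# a) (I# b) = case checkedAddInt# a b of
  (# r, 0# #) -> Just (I# r)
  (# _, _  #) -> Nothing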

3 Likes

See this example: atomic-counter/Counter.cmm at master · sergv/atomic-counter · GitHub

3 Likes

speaking of checked arithmetic in ghc, a major problem is that it’s not comprehensive; it’s better to have:

  • explicitly sized int/word operand types
  • high level base library wrappers

it’s possible to diy one’s own checked arithmetic implementation using regular primops (e.g. see genericIntAddCOp in the compiler), but it’s still better to have primop coverage, and different backends may be able to emit more efficient code.
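for reference, the sign-bit trick that a genericIntAddCOp-style lowering relies on can also be written directly in Haskell (a sketch, not the compiler’s actual code):

import Data.Bits (xor, (.&.))

-- Wrapping add plus an overflow flag: signed overflow happened iff the
-- result's sign differs from the signs of both operands.
addWithOverflow :: Int -> Int -> (Int, Bool)
addWithOverflow x y =
  let r = x + y  -- (+) @Int wraps
  in (r, ((x `xor` r) .&. (y `xor` r)) < 0)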

@Kleidukos already opened Adding checked arithmetic low-level functions (#22665) on the GHC GitLab a few years ago about this topic, thanks!

7 Likes

I agree with you. The Haskell report (and GHC) leaves it unspecified what happens on overflow. This was an intentional choice 35 years ago (in the name of efficiency). So that means that Haskell arithmetic is, sadly, in your last category “Undefined”.

I’m in the middle of changing MicroHs to do trapping arithmetic. It’s definitely not without problems, e.g., (-128)::Int8 overflows in the negation since it is negate (fromInteger @Int8 128). But I will give it a shot and see what happens. As with most things in MicroHs, it will be an optional feature.

It would be great to have primitives for all the variants that you suggest. And we also should work towards making a better choice for the default.

3 Likes

I would consider the future behaviour of MicroHs a bug. The documentation of Num says that “(+) and (*) are customarily expected to define a ring”, which implies that the operations are total.

So you’d rather have arithmetic that silently gives you the “wrong” answer than have it scream at you? I don’t.

Since there are already several instances (Double, Float, Natural) that don’t obey the ring axioms, I feel that I’m in good company.

I consider that the core problem: how rigorously are we expecting the already weakly stated arithmetic laws to hold? Even numeric-prelude falsely claims that Double is a ring. One way forward would be to have new typeclasses with high entry barriers for membership, e.g. class HonestToGodRingWithTotalAndLawfulOperations, of which Integer but not Double could be a member.

See, I think part of the problem is that Haskell conflates two things. First, it conflates the concept of a binary operation of a lawful ring / additive group / etc. (which has much the same problems as having a privileged Monoid instance) with the more general concept of a forgetful / exceptional function that only approximates it within certain limits (I do support better checked arithmetic). Second, we also conflate the function “add” with the symbolic operator +: Num requires defining it as (+), which not only locks fallible numeric addition behind lawful additive groups, but also steals the syntactic operator, so that working around it means hiding the entire Num hierarchy. That partly contributes to the frankly rather insane number of custom operators endemic to advanced Haskell.

1 Like

I feel like this lends itself to newtypes for deriving via, e.g.:

newtype OverflowStrategy a = OverflowStrategy a

instance (Num a) => Num (OverflowStrategy a) where
  -- ... implement the chosen overflow behaviour here ...

-- ... other strategies likewise ...

newtype MyCustomIntegral a = MyCustomIntegral a
    deriving Num via (OverflowStrategy a) -- requires DerivingVia

or equivalent, then using coerce to go between them.
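For example, a saturating variant could look roughly like this (names invented; the Integer-based clamping is only for illustration, not for speed):

{-# LANGUAGE ScopedTypeVariables #-}

newtype Saturating a = Saturating a
  deriving (Eq, Ord, Show)

clamp :: forall a. (Integral a, Bounded a) => Integer -> a
clamp = fromInteger
      . max (toInteger (minBound :: a))
      . min (toInteger (maxBound :: a))

instance (Integral a, Bounded a) => Num (Saturating a) where
  Saturating x + Saturating y = Saturating (clamp (toInteger x + toInteger y))
  Saturating x - Saturating y = Saturating (clamp (toInteger x - toInteger y))
  Saturating x * Saturating y = Saturating (clamp (toInteger x * toInteger y))
  negate (Saturating x)       = Saturating (clamp (negate (toInteger x)))
  abs    (Saturating x)       = Saturating (clamp (abs    (toInteger x)))
  signum (Saturating x)       = Saturating (signum x)
  fromInteger                 = Saturating . clamp

With something like that, coerce (or DerivingVia, as above) gives saturating (+) and (*) at any bounded integral type.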

Maybe a library like that exists, maybe it’s worth experimenting with, or getting something like this into base.

Cheers

1 Like

Rust actually does this: it has the Saturating and Wrapping newtypes for saturating and wrapping arithmetic, respectively. I think it would work even better in Haskell, since number literals can be overloaded. However, the Haskell version of this would require FlexibleInstances.

You could avoid FlexibleInstances by introducing an extra class:

newtype Wrapping a = Wrapping a

class WrappingNum a where
  wrappingAdd :: a -> a -> a
  -- ... and so on for the other operations ...

instance (WrappingNum a) => Num (Wrapping a) where
  Wrapping x + Wrapping y = Wrapping (wrappingAdd x y)
  -- ... and so on ...