Why is the length of a string in Foreign.C.String defined as Int and not as CSize?

The Foreign.C.String module contains the type

-- | A string with explicit length information in bytes instead of a
-- terminating NUL (allowing NUL characters in the middle of the string).
type CStringLen = (Ptr CChar, Int)

However, as far as I understand, string lengths are conventionally given as size_t in C, i.e., Foreign.C.Types.CSize in Haskell. E.g., that’s the type returned by strlen. As a result, my code is often riddled with calls to fromIntegral to convert from Int to CSize, which is annoying.
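To make the friction concrete, here is a minimal sketch of the conversion in question. The helper name `cStringLenSize` is hypothetical and not part of base; it only wraps the `fromIntegral` call that otherwise has to be repeated at every call site that passes a length to a C function expecting `size_t`.

```haskell
import Foreign.C.String (CStringLen, withCStringLen)
import Foreign.C.Types (CSize)

-- Hypothetical helper (not in base): read the length of a CStringLen
-- as a CSize, so it can be passed to a C function expecting size_t.
-- Assumes the Int is non-negative, which holds for lengths produced
-- by withCStringLen.
cStringLenSize :: CStringLen -> CSize
cStringLenSize (_, n) = fromIntegral n

main :: IO ()
main =
  -- withCStringLen marshals the String and hands us (pointer, length in bytes).
  withCStringLen "hello" $ \csl ->
    print (cStringLenSize csl)
```

Centralising the conversion in one helper does not remove the mismatch, but it keeps the `fromIntegral` calls out of the rest of the code.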

Is this just a historical accident, or does it have a deeper meaning? If the former, is there any chance this can be changed in future versions of base? How would one go about that?


There was some discussion earlier this year about Int being used in places where an unsigned type would be better. See these threads on Haskell Cafe:

If you have a concrete proposal to change base, take a look at core-libraries-committee/PROPOSALS.md at main · haskell/core-libraries-committee · GitHub.


I think the posts @sjakobi links to suggest changing Int to Natural, because that is theoretically the proper type for these functions. But using CSize for compatibility with C is mostly a different matter, in my opinion.

An argument against size_t is mentioned in the great presentation “What about the natural numbers?” (at 36:44). The summary is that size_t should probably not be used, because implicitly casting between signed and unsigned types is not a very good idea. But I don’t think that applies here, because Haskell doesn’t have implicit casting.
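As a rough illustration of that last point: in Haskell, every conversion between Int and CSize has to be written out, so the signed/unsigned pitfall is at least visible in the source. The sketch below uses `fromIntegral`, which wraps on a negative input, and `toIntegralSized` from `Data.Bits`, which returns `Nothing` when the value does not fit. The function names `wrapping` and `checked` are made up for this example.

```haskell
import Data.Bits (toIntegralSized)
import Foreign.C.Types (CSize)

-- Explicit but unchecked: a negative Int wraps around to a large CSize.
wrapping :: Int -> CSize
wrapping = fromIntegral

-- Explicit and checked: Nothing if the Int is not representable as CSize.
checked :: Int -> Maybe CSize
checked = toIntegralSized

main :: IO ()
main = do
  print (wrapping (-1) == maxBound)  -- the wrap-around is silent, but the call is visible
  print (checked (-1))
  print (checked 5)
```

Neither conversion happens without being written down, which is the difference from C's implicit signed-to-unsigned promotion.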
