# ghci
GHCi, version 9.8.2: https://www.haskell.org/ghc/ :? for help
ghci> import Foreign.Marshal.Utils
ghci> (1 + 1 :: Int) == 2
True
ghci> with (1 + 1 :: Int) print
0x000000420042a428
ghci> with (2 :: Int) print
0x000000420042c248
ghci>
No surprise here - while expressions 1 + 1 :: Int and 2 :: Intreduce to the same normal form, Haskell has non-strict semantics. So those two expressions are represented internally by two different structured values: a different thunk for each expression. That’s why the results of with and withForeignPtr are I/O actions.
The byte array in question is zero-sized, there’s no good reason to ever follow that pointer.
Furthermore sharing empty values is a whole separate maintainer headache, see e.g. text#493.
Personally I’m in favor of data ByteString = BS1 ByteString1 | Empty, since most functions already do emptiness checks and ByteString1 has its own uses, same with Text. You’d need a lot of pushing to get something like that through, however.
Empty BS is represented by null pointer, “” is represented by null pointer
Empty BS is represented by pointer to nothing, “” is represented by pointer to nothing, the address of the pointer to nothing is shared
Empty BS is represented by pointer to nothing, “” is represented by pointer to nothing, the addresses of the pointers are different
Also, is the nothing, another null pointer, nothing (empty array of char), or a null terminated CString (array of char with 1 element but that strlen says that it is length 0)?
It doesn’t seem easy. I’d rather consider this a design challenge than a flaw.
Yes, and, this may also introduce a little performance degradation(I’m not sure).
Well, the current behaviour is acceptable since it’s a defined behaviour. However, it might be a good idea to put a warning remark in the documentation.
Yes, generally, this is fine. However, it requires users to be more careful. (Although dealing with pointers is always dangerous)
Consider the following example:
// c code
void f(char* p) {
if (p)
do something...
else
do another...
}
-- haskell
foreign import ccall unsafe "f" c_f :: Ptr Word8 -> IO ()
f :: ByteString -> IO ()
f bs = withForeignPtr (fst (BSI.toForeignPtr0 bs)) c_f
Normally, you may expect f BS.empty to be totally the same as f "", but actually, it will enter two different branches defined in the C code.
Consider using functions from Data.ByteString.Unsafe instead: they are in the stable part of the API and they are well-documented. In this case unsafeWithCStringLen does the same thing.
And, once again, you may not discard length when using ByteStrings this way because internally they are not null-terminated.
See also this issue about sharing 0 length ByteArray# objects. Ben makes the good point that users might depend on creating distinct 0-length objects for instance because they might use them with weak pointers.
users should not expect two different bytestrings containing the string “hello” to have the same pointer, and so by induction there’s no reason they should expect that two containing the string “” would either.
in general, pointer equality is never presumed to be synonymous with semantic equality.