Only possible if you want the freeFinalizer
— in my case I have to use the destroy function provided by libtelnet
.
This is somewhat orthogonal to the question about ForeignPtr
: are the foreign import
types correct?
It seems to me that botan_init_hash
has the correct type Ptr (Ptr ()) -> ...
, but botan_hash_name
does not. I think the latter should be Ptr () -> ...
.
This isn’t exactly the code you were trying to write, but the following seems to return something sensible for me (48
for SHA-384
):
main = do
outputLength <- botanHashLength "SHA-384"
print outputLength
type OpaqueHash = Ptr Word8
foreign import ccall unsafe "botan_hash_output_length"
botan_hash_output_length :: OpaqueHash -> Ptr CSize -> IO Int
foreign import ccall unsafe "botan_hash_init"
botan_hash_init :: Ptr OpaqueHash -> Ptr CChar -> Word32 -> IO Int
botanHashLength :: String -> IO Int
botanHashLength string = withCString string $ \ptr -> do
ppHash <- malloc
botan_hash_init ppHash ptr 0
pHash <- peek ppHash
pSize <- malloc
botan_hash_output_length pHash pSize
fromIntegral <$> peek pSize
EDIT: This is very rough, but for setting up the ForeignPtr
properly, the following seems to work for me:
type OpaqueHash = Ptr ()
foreign import ccall unsafe "botan_hash_name"
botan_hash_name :: OpaqueHash -> Ptr CChar -> Ptr CSize -> IO Int
foreign import ccall unsafe "&botan_hash_destroy"
botan_hash_destroy :: FunPtr (OpaqueHash -> IO ())
foreign import ccall unsafe "botan_hash_output_length"
botan_hash_output_length :: OpaqueHash -> Ptr CSize -> IO Int
foreign import ccall unsafe "botan_hash_init"
botan_hash_init :: Ptr OpaqueHash -> Ptr CChar -> Word32 -> IO Int
newtype Hash = Hash (ForeignPtr ())
hashInit
:: String
-> IO Hash
hashInit string = withCString string $ \ptr -> do
ppHash <- malloc
botan_hash_init ppHash ptr 0
pHash <- peek ppHash
fp <- newForeignPtr botan_hash_destroy pHash
return $ Hash fp
botanHashName
:: Hash
-> IO String
botanHashName hash@(Hash fp) = withForeignPtr fp $ \pHash -> do
pSize <- malloc
botan_hash_output_length pHash pSize
reqLen <- peek pSize
pCChar <- mallocArray (fromIntegral reqLen)
botan_hash_name pHash pCChar pSize
peekCString pCChar
main = do
hash <- hashInit "SHA-384"
name <- botanHashName hash
print name
Thank you - this helped confirm what I needed to do.
I see now! This was most helpful!
Indeed I am giving them closer scrutiny after discovering that some things were incorrectly documented. Your comment has been very helpful, and you’ve expanded on your answer greatly as I was integrating the results of your initial comment with what everyone else has said - but it appears we converged something very similar regardless!
It is late, but I think I have the clarity to move forward tomorrow with the Botan.Hash
module. Thank you all so much for your very helpful pointers!
One thing I’ve seen some libraries do is to do the foreign pointers like this:
newtype Hash = Hash (ForeignPtr Hash)
I believe it was haskell-gi
that does things that way.
If someone wouldn’t mind sanity checking this for me, I want to make sure that this initialization pattern is sane and doesn’t leak:
-- typedef struct botan_foo_struct* botan_foo_t;
type OpaqueFoo = Ptr ()
newtype Foo = Foo { fooForeignPtr :: ForeignPtr OpaqueFoo }
-- int botan_foo_init(botan_foo_t* foo);
foreign import ccall unsafe botan_foo_init :: Ptr OpaqueFoo -> IO BotanErrorCode
-- int botan_foo_destroy(botan_foo_t foo);
foreign import ccall "&botan_foo_destroy" botan_foo_destroy :: FunPtr (Ptr OpaqueFoo -> IO ())
fooInit :: IO Foo
fooInit name = do
fooForeignPtr <- malloc >>= newForeignPtr botan_foo_destroy
withForeignPtr fooForeignPtr $ \ fooPtr -> do
throwBotanIfNegative_ $ botan_foo_init fooPtr
return $ Foo fooForeignPtr
My concern is that Botan
's design is to pass a pointer-pointer in to the initializer and return an success-error code, rather than return an initialized pointer. Being unfamiliar with ForeignPtr
, I want to ensure that the pointer that we malloc
for use with botan_foo_init
is itself destroyed with botan_foo_destroy
when the ForeignPtr
is garbage collected, and not just the opaque object that it was pointing to.
I think this is correct, because when I used alloca
instead of malloc
(because I was replacing ByteArray.alloc
), I got exceptions because long-lived references were having their pointers freed - but what I don’t want now is the opposite problem of leaking memory. I made the swap to malloc
after reading the docs more closely (and checking the source of ByteArray.alloc
revealed that it actually used mallocByteString
, too).
So now, I am not observing any faulty behavior, but I’d surely appreciate a sanity check from another set of eyes.
This should be
-- typedef struct botan_foo_struct* botan_foo_t;
data BotanStruct -- empty data type used only as a distinguishing tag for pointer types
type BotanPtr = Ptr BotanStruct -- optional type synonym
newtype Botan = MkBotanForeignPtr { getBotanForeignPtr :: ForeignPtr BotanStruct }
-- int botan_foo_init(botan_foo_t* foo);
foreign import ccall unsafe botan_foo_init :: BotanPtr -> IO BotanErrorCode
The type parameter of Ptr
is phantom and just helps you distinguish pointers to one type of thing from another. For opaque types you should either make a new, empty data type as I’ve done above, or use the circular newtype pattern that appeared in an earlier post newtype Botan = Mk (ForeignPtr Botan)
. I like this version less, myself, because I like having the independent type that can be used both to index Ptr
and ForeignPtr
as we alternate between the two.
You shouldn’t use malloc
because BotanStruct
won’t be Storable
because it’s opaque. You can instead use mallocBytes :: Int -> IO (Ptr a)
.
Also, don’t use newForeignPtr
until you’ve successfully initialized your struct. Presumably you don’t destroy something that isn’t initialized successfully. So first mallocBytes
, then initialize and throw on error like you did, then newForeignPtr with the destructor.
So, something rather like this?
fooInit :: IO Mac
fooInit = do
fooPtr <- mallocBytes _ -- What goes here?
throwBotanIfNegative_ $ botan_foo_init fooPtr
fooForeignPtr <- newForeignPtr botan_foo_destroy fooPtr
return $ MkFooForeignPtr fooForeignPtr
It is this bit that still has me scratching my head. I was originally using Ptr ()
and malloc
because of these two things:
Since a pointer is a pointer, we can of course cast between Ptr ()
and Ptr OpaqueStruct
(to use my initial terminology). We aren’t responsible for initializing the botan_struct
memory, we just need to hand it our own pointer that itself won’t be freed - so how many bytes do we need to allocate for the pointer itself?
Since all pointers are the same size, and malloc
is just mallocBytes (sizeOf (undefined :: a))
, we could use:
fooPtr <- mallocBytes (sizeOf (undefined :: Ptr ()))
Alternatively:
fooPtr <- castPtr <$> malloc @(Ptr ())
Am I crazy? Or just missing something obvious? Or do I have this correct?
Note that I am trying it out in with the next module (Mac
) before going back and refitting Hash
and Random
, and it actually seems to work*. For the moment, I am more concerned with whether or not there is any leaky behavior. Once the implementation is memory-safe, the proper Haskell-y abstraction can be debated at our leisure, whereas I cannot move forwards without a working implementation.
* This actually does not work with the way that this library is structured.
For example, take the following function that we must be able to run
-- int botan_mac_output_length(botan_mac_t mac, size_t* output_length);
I am not sure how to pass in the MacStruct
here. Previously, when type OpaqueMac = Ptr ()
and newtype Mac = Mac (ForeignPtr OpaqueMac)
, the double-pointer seemed to be represented properly, and I could do:
-- Note the `OpaqueMac / Ptr ()` for `botan_mac_t mac`
foreign import ccall unsafe botan_mac_output_length :: OpaqueMac -> Ptr CSize -> IO BotanErrorCode
macOutputLength :: Mac -> IO Int
macOutputLength (MkMac macForeignPtr) = withForeignPtr macForeignPtr $ \ macPtr -> do
mac <- peek macPtr
alloca $ \ szPtr -> do
throwBotanIfNegative_ $ botan_mac_output_length mac szPtr
fromIntegral <$> peek szPtr
However, with the new style using the uninhabited type, I cannot actually peek
because it is not Storable
(obviously). Instead, I have to do:
-- Note the `MacPtr` for `botan_mac_t mac`
foreign import ccall unsafe botan_mac_output_length :: MacPtr -> Ptr CSize -> IO BotanErrorCode
macOutputLength :: Mac -> IO Int
macOutputLength (MkMac macForeignPtr) = withForeignPtr macForeignPtr $ \ macPtr -> do
alloca $ \ szPtr -> do
throwBotanIfNegative_ $ botan_mac_output_length macPtr szPtr
fromIntegral <$> peek szPtr
But I feel like botan_mac_t mac
(which matched OpaqueMac
) should match MacStruct
rather than MacPtr
. Or is the double-pointer throwing me off again?
Here’s the cleaned up version
module Help where
import Data.Word (Word32)
import Foreign.C (CString, CInt(..), withCString)
import Foreign.ForeignPtr (ForeignPtr, newForeignPtr, FinalizerPtr)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr, FunPtr)
import Foreign.Storable (peek)
data BotanHashStruct
-- int botan_hash_init(botan_hash_t* hash, const char* hash_name, uint32_t flags)
foreign import ccall unsafe botan_hash_init :: Ptr (Ptr BotanHashStruct) -> CString -> Word32 -> IO CInt
-- int botan_hash_destroy(botan_hash_t hash)
foreign import ccall "&botan_hash_destroy" botan_hash_destroy_funptr :: FinalizerPtr BotanHashStruct
-- this is a cheat because it ignores the 'int' return on hash_destroy
newtype BotanHash = BotanHash (ForeignPtr BotanHashStruct)
throwBotan :: CInt -> IO a
throwBotan x = fail ("Botan error: " ++ show x) -- do something better here
newBotanHash :: String -> IO BotanHash
newBotanHash name =
alloca $ \outPtr ->
withCString name $ \namePtr ->
botan_hash_init outPtr namePtr 0 >>= \ret ->
if 0 /= ret then throwBotan ret else
peek outPtr >>= \out ->
newForeignPtr botan_hash_destroy_funptr out >>= \fp ->
pure (BotanHash fp)
I can post the Hash
which uses the old style:
-- NOTE: I've elided irrelevant arguments
-- typedef struct botan_hash_struct* botan_hash_t;
type OpaqueHash = Ptr ()
newtype Hash = Hash { hashForeignPtr :: ForeignPtr OpaqueHash }
-- int botan_hash_init(botan_hash_t* hash, ...);
foreign import ccall unsafe botan_hash_init :: Ptr OpaqueHash -> ... -> IO BotanErrorCode
foreign import ccall "&botan_hash_destroy" botan_hash_destroy :: FunPtr (Ptr OpaqueHash -> IO ())
hashInit :: ByteString -> IO Hash
hashInit name = do
hashForeignPtr <- malloc >>= newForeignPtr botan_hash_destroy
withForeignPtr hashForeignPtr $ \ hashPtr -> do
throwBotanIfNegative_ $ botan_hash_init hashPtr ...
return $ Hash hashForeignPtr
-- int botan_hash_output_length(botan_hash_t hash, size_t* output_length);
foreign import ccall unsafe botan_hash_output_length :: OpaqueHash -> Ptr CSize -> IO BotanErrorCode
hashOutputLength :: Hash -> IO Int
hashOutputLength (Hash hashForeignPtr) = withForeignPtr hashForeignPtr $ \ hashPtr -> do
hash <- peek hashPtr
alloca $ \ szPtr -> do
throwBotanIfNegative_ $ botan_hash_output_length hash szPtr
fromIntegral <$> peek szPtr
You can see the hash <- peek hashPtr
, which is a similar scenario to the issue mentioned.
I shall give this a try momentarily - right now I need some lunch
Despite any confusion, all of this has been wonderfully illustrative as to the proper internal workings of Haskell FFI. Taking the time to do it right, now, will save us from pain, later.
Ah, yes, you see it too. I havent figured out a way around this one - I’m not even sure where throwBotanIfNegative
would throw to, even if we could (or what our response would be)!
back from lunch
If I could give this a second heart, I would. This has been a most insightful discussion. I believe this answers all of the minor quibbles and questions that I had previously regarding safe and proper initialization. Furthermore, it is more ergonomic, effectively moving the peek
from all of the accessory functions into the initializer.
I have additionally implemented one of the accessory functions, to sanity check our definitions, and it works out nicely:
Barring any unforeseen questions, I will be retrofitting the other modules with these learnings.
I’m back with another headscratcher.
I have the following FFI function that is expecting an array of pointers to multiple buffers of known size as input. Simplifying to CStrings, It has a declaration similar to the following:
-- int use_multiple_cstrings(cchar *const*const inputs, size_t numBuffers, ...);
foreign import ccall unsafe use_multiple_cstrings :: Ptr (Ptr CChar) -> CSize -> ... -> IO BotanErrorCode
If we knew the number of buffers ahead of time (eg, 3), we could just write it out:
useMultipleCStrings :: ByteString -> ByteString -> ByteString -> IO ()
useMultipleCStrings a b c = do
useAsCString a $ \ aPtr -> do
useAsCString b $ \ bPtr -> do
useAsCString c $ \ cPtr -> do
allocaArray 3 $ \ (ptrArrayPtr :: Ptr (Ptr CChar)) -> do
let ptrs = [aPtr,bPtr,cPtr]
pokeArray ptrArrayPtr ptrs
throwBotanIfNegative_ $ use_multiple_cstrings ptrArrayPtr 3
However, the number of buffers / cstrings is not known ahead of time. I could just unsafely reach in to grab the pointers eg, unsafeUseAsCString return
, but that’s dangerous (what if the bytestring gets move, etc).
Is there an idiomatic way of doing this safely? I feel like I should be able to do something with join
or foldr
or traverse
with useAsCString
to get useAsCStringArray :: [ByteString] -> ([Ptr CChar] -> IO a) -> IO a
, but it eludes me for the moment.
I’ve whipped up an interim solution which works.
withPtrs :: (forall a . typ -> (ptr -> IO a) -> IO a) -> [typ] -> ([ptr] -> IO b) -> IO b
withPtrs withPtr [] act = act []
withPtrs withPtr (typ:typs) act = withPtr typ $ \ typPtr -> withPtrs withPtr typs (act . (typPtr:))
It seems to work fine:
> withPtrs withCString ["foo","bar","qux"] (print <=< mapM peekCString)
["foo","bar","qux"]
I still think it could be done with foldr
, maybe I’ll return to this when I feel like playing code golf again.
I honestly think your solution seems fine? Unless I’m missing some weakness it has.
It’s mostly curiosity now that there’s a solution
unsafeUseAsCString isn’t unsafe because of GC-related movement, but you can read about what is unsafe about it in the docs.
When I needed to work with a lot of this kind of “withPtr” FFI, I made a newtype for it.
NestedIO is for IO stuff that needs to operate in these nested “with” contexts.
This newtype allows me to implement a wrapper for any of the “withPtr”-style functions with nest1 :: (forall r. (a -> IO r) -> IO r) -> NestedIO a
With this I can take withText :: Text -> NestedIO FgnStringLen
and then use traverse
to do a version of what you’re asking about:
traverse withText :: [Text] -> NestedIO [FgnStringLen]
Then you can see this in a larger example like in withRawIrcMsg where a great many “withPtr” like operations are all chained together.
Indeed - that is why I avoided unsafely reaching in to grab the pointer via unsafeUseAsCString return
- I probably should have specified ‘collected’ instead of ‘move(d)’. (Does getting garbage-collected count as moving?)
This is most excellent - I already have a WithPtr
type-alias, which is but a step away from it:
type WithPtr typ ptr = (forall a . typ -> (ptr -> IO a) -> IO a)
I shall definitely have to give some thought towards refining things further. I especially like the trick with traverse
The pointer can’t be collected when using it with that operation. That’s not how it is unsafe. It’s the same mechanism as withPtr or alloca.
That was a small joke in reference to my linguistic faux pas, equivalent to calling deleting a file “moving it to /dev/null”, that sort of thing. I understand how unsafeUseAsCString
is unsafe - it allows breaking referential transparency (editing the bytes edits the bytestring and any other sharing the same backing memory) and the pointer’s lifetime is scoped to wrapped action (so don’t use the pointer after).
Apologies for any confusion caused.
I prefer this to having a newtype over Ptr ()
(that would be void* and you can pull it out thin air) or Ptr Uninhabited
.
It protects the pointer, removing the pointer details from public API while keeping it in one piece, without splitting into two or three distinct entities only to recombine them back again.
This is how I solved a similar problem in my libtelnet
binding:
-- | Collect '[ByteString]' into a temporary array of strings in a
-- 'Ptr CString', for passing to C functions.
useAsCStrings :: [ByteString] -> (Ptr CString -> IO a) -> IO a
useAsCStrings list f = go list [] where
go [] css = withArray (reverse css) f
go (bs:bss) css = B.useAsCString bs $ \cs -> go bss (cs:css)