FFI: Is there any efficient way to implement a haskell wrapper as a "multi-shot" callback for c++?

Hi, suppose I have an async c++ function like this:

void setRecvCallback(Reader* reader, std::function<void(std::string&&)> callback);

I want to call this in my haskell code. And I know the usual way is to create a foreign export and pass the FunPtr as the callback to the c++ ffi. However, this seems inefficient.

Besides, I know a better way is to use hs_try_putmvar as described in the ghc user guide. But this way only works if the callback is “one-shot”, which means only trigger one time.

So, my question is: if the callback will be called multiple times by a cpp worker, is there any efficient way to implement the haskell wrapper instead of using the usual FunPtr way?

Thanks

I don’t see this mentioned in the GHC documentation. I guess a problem could be that the Haskell thread might not have enough time to clear the MVar and hence messages may get lost. Is that what you mean?

Another option is perhaps to keep some kind of queue on the C/C++ side. And only use the MVar to signal that something is in that queue, not to transfer the actual data. Then it doesn’t matter if successive events get lost. And the Haskell thread can come around later and use a different (possibly unsafe) FFI function to empty the queue.

1 Like

If you’re concerned about efficiency, just write the callback in C++ instead of Haskell, and apply it via an auxiliary C++ definition which is then called by Haskell:

extern void setRecvCallback(Reader* reader, std::function<void(std::string&&)> callback);

// you write this
std::function<void(std::string&&)> myRecvCallback /* ... */

// Haskell calls this
void setMyRecv(Reader* reader) {
    setRecvCallback(reader, myRecvCallback);
}

If we want to wake up Haskell threads, we must call hs_try_putmvar, and

The Haskell allocated StablePtr is freed by hs_try_putmvar()

Which means when the callback is called the second time, it will use an already freed HsStablePtr, which is a terrible thing and will cause a coredump.


Using an async queue seems a good idea. However, in the real world, things will become complex. Let’s say I have a multi-threaded CPP TCP server, I want to write a Haskell handler to handle TCP streaming io. Do I need to allocate a queue for each request? And how can Haskell know the address of the queue for the request? (There may also be other issues)

Thanks for your reply. However, the base thing I want to do is let Haskell do the callback. (Imagine a C++ TCP server with a Haskell handler.)

Let’s see what can be done using C:

# cat C_Side.c
void thisUsesACallback(char *msg, void (*callme)(char *)) {
       (*callme)(msg);
}

# cat Hs_Side.hs
{-# LANGUAGE Haskell2010 #-}
import Foreign
import Foreign.C.String

main = do cbk <- mkCallback myCallback
          str <- newCString "it actually worked!"
          thisUsesACallback str cbk
          free str
          freeHaskellFunPtr cbk

myCallback :: CString -> IO ()
myCallback p = do str <- peekCString p
                  putStrLn ("myCallback: " ++ str)

foreign import ccall
    thisUsesACallback :: CString -> FunPtr Callback -> IO ()

type Callback = CString -> IO ()

foreign import ccall "wrapper"
    mkCallback :: Callback -> IO (FunPtr Callback)

# gcc -c C_Side.c
# ghc -c Hs_Side.hs
# ghc -o testme Hs_Side.o C_Side.o
# ./testme
myCallback: it actually worked!
#

So for C, you don’t need a foreign export. Making something like this work in C++ is left as an exercise…

Yes. Sorry for the inaccurate description.

And I know the usual way is to create a foreign export and pass the FunPtr as the callback to the C++ FFI

I don’t need a foreign export, instead, I’d need a foreign wrapper actually.

Here is a simplified version of my actual code (pseudocode):

// cpp side
using HsCallback = void (*)(server_request_t*, server_response_t*);

struct Handler {
  HsCallback& callback;

  asio::awaitable<void> operator() {
    server_request_t request;
    server_response_t response;
    co_await async_get_request(&request);
    (*callback)(&request, &response);
    ...
  }
}

void run_async_cpp_server(HsCallback callback) {
  async_run(Handler{callback});  // block
}

// haskell side
type ProcessorCallback = Ptr Request -> Ptr Response -> IO ()

foreign import ccall "wrapper"
  mkProcessorCallback :: ProcessorCallback -> IO (FunPtr ProcessorCallback)

foreign import ccall safe "run_async_cpp_server"
  run_async_cpp_server :: FunPtr ProcessorCallback -> IO ()

So, my question is, can we do better? can I only have some unsafe ffi like the way through hs_try_putmvar to achieve this? Since unsafe ffi is more efficient than all else.

No - if there was a “more efficient” and trustworthy way to implement GHC’s FFI, it would already be in use. You only have to look at the libraries GHC provides by default (without bringing in extra packages) to see how important the FFI is, and not just in terms of efficiency.

As for hs_try_putmvar, the documentation you referred to makes it quite clear that it’s a considerably-specialised variant of tryPutMVar: it’s an “apples-and-oranges” comparison.