Using unsafePerformIO safely?

was that meant to be ā€œcalling foreign functions requires IOā€ ? Or are you referring to this comment ā€¦

Neither, I donā€™t think; what I mean is that literally any action at all requires physics; i.e. some interaction with the outside world.

To call a function the computer needs to perform actions in the real world; a disk needs to spin, a memory address must be read, etc, etc. Weā€™ve decided that we donā€™t want to capture this level of detail in our type system (thankfully!), but what I mean is that itā€™s totally arbitrary. And, I think, this thread is about pushing the boundaries of that idea; i.e. how can we (lowly) programmers control that boundary? I.e. merely because I donā€™t have an SMT solver in Haskell Iā€™m forced to think about that boundary; and thatā€™s, conceptually, super frustrating! Thereā€™s no in principle reason for this abstraction to be broken at this point, in my view (and this is how I interpret the thread/question).

3 Likes

So whatā€™s the alternative - shoving the entire SMT-solver framework inside a Haskell implementation just to have it available as primitive, but otherwise regular Haskell functions?

Like it or not, programming languages must have some way to access legacy code: considering how many new ones continue to appear year after year, it is utterly infeasible for us programmers to rewrite all that legacy code to suit each and every language. Maybe at the other end of this century (or millennium) matters will be easier, but until then weā€™ll just have to deal with it as best we can, with what we currently have.

1 Like

So whatā€™s the alternative - shoving the entire SMT-solver framework inside a Haskell implementation just to have it available as primitive, but otherwise regular Haskell functions?

Certainly not! Thatā€™s the entire point of the thread - there has to be a better alternative than that? Merely to call a function that we know is ā€œpureā€ but just inconveniently happens to live just outside our abstraction boundary. The point is, how we can be bring it inside safely, without this annoying abstraction-breaking burden of a monad.

2 Likes

No, and I think @silky is defending exactly that and shares the same frustration as I: we need to come up with better answers about how to interact with the world. The blanket answer of ā€œjust push the IO monad through the entire codebaseā€ is not great for a number of reasons that were already outline higher up in this thread.

1 Like

Well, discussing alternative approaches to I/O (or dealing with state) is probably best left to a new threadā€¦are there any volunteers?

I think unsafePerformIO might actually be a pretty good alternative, we just need to understand how to use it safely, which was what motivated my original question to @rae :slight_smile:

I got a lot out of this thread and Iā€™ll read and study all of the material, so thank you to everyone that gave me pointers and suggestions!

1 Like

ā€¦is a draft version of e.g. a guide now publicly available?

The Haskell redis driver hedis makes a brilliant use of UnsafeInterleaveIO. I cannot imagine an STM implemention able to compete with it, performance-wise.

Itā€™s not quite clear to me why that canā€™t just be done in IO.

Letā€™s refactor @Nycticoraxā€™s example - if Iā€™m reading it correctly:

connGetReplies :: Connection -> IO [Reply]
connGetReplies conn = go S.empty (SingleLine "previous of first")
  where
    go rest previous = do
      -- lazy pattern match to actually delay the receiving
      ~(r, rest') <- unsafeInterleaveIO $ 
        -- Force previous reply for correct order.
        previous `seq` connGetReply conn rest
      rs <- unsafeInterleaveIO (go rest' r)
      return (r:rs)

connGetReply ::  Connection -> S.ByteString -> IO (Reply, S.ByteString)
connGetReply conn@Conn{..} rest =
     do scanResult <- Scanner.scanWith readMore reply rest
        case scanResult of
          Scanner.Fail{}       -> CC.errConnClosed
          Scanner.More{}    -> error "Hedis: parseWith returned Partial"
          Scanner.Done rest' r -> do
            -- r is the same as 'head' of 'connPending'. Since we just
            -- received r, we remove it from the pending list.
            atomicModifyIORef' connPending $ \(_:rs) -> (rs, ())
            -- We now expect one less reply from Redis. We don't count to
            -- negative, which would otherwise occur during pubsub.
            atomicModifyIORef' connPendingCnt $ \n -> (max 0 (n-1), ())
            return (r, rest')

With the comments removed:

connGetReplies :: Connection -> IO [Reply]
connGetReplies conn = go S.empty (SingleLine "previous of first")
  where
    go rest previous = do
      ~(r, rest') <- unsafeInterleaveIO $ previous `seq` connGetReply conn rest
      rs <- unsafeInterleaveIO $ go rest' r
      return (r:rs)

ā€¦connGetReplies has an interesting resemblance to Data.Supply.newSupply:

{-# INLINE newSupply #-}
newSupply :: a -> (a -> a) -> IO (Supply a)
newSupply start next = gen =<< newIORef start
  where gen r = unsafeInterleaveIO
              $ do v  <- unsafeInterleaveIO (atomicModifyIORef r upd)
                   ls <- gen r
                   rs <- gen r
                   return (Node v ls rs)
        upd a = let b = next a in seq b (b, a)

in particular the dual use of unsafeInterleaveIO. Another similar definition is Control.Concurrent.getChanContents:

getChanContents :: Chan a -> IO [a]
getChanContents ch
  = unsafeInterleaveIO (do
        x  <- readChan ch
        xs <- getChanContents ch
        return (x:xs)
    )

Using getChanContents as the example, if unsafeInterleaveIO was removed:

getChanContents ch = do
        x  <- readChan ch
        xs <- getChanContents ch
        return (x:xs)

ā€¦the result (x:xs) would never be returned - it may as well be defined as:

getChanContents ch = sequence $ repeat $ readChan ch

At least one of two things would run out - memory (stack overflow or heap exhaustion) or patience!

Whether unsafeInterleaveIO can be avoided by using e.g. interatees is left as an exerciseā€¦

Sorry @atravers I didnā€™t quite get the intent; so you agree that itā€™s a useful/legit use case in this context, donā€™t you?

I agree with this, but Iā€™m not certain that itā€™s enough. The problem here is that words like ā€œside effectā€ imply some kind of abstraction boundary between things that the program should care about, and things that it should not. Without that abstraction boundary, nothing can ever satisfy the conditions to be safe.

In this example, invoking an SMT solver can definitely have side effects if other parts of the program look at the list of running processes to decide whether the SMT solver is running. Does that matter? Most of the time, probably notā€¦ but it certainly does matter if youā€™re using the SMT solver as part of a program that enforces a policy on how many processes a user is allowed to run!

For this reason, I reach the conclusion that Richardā€™s hypothetical ProofOfSafety is fundamentally impossible to define as part of a general-purpose library, because it relies on decisions about how to draw the right abstraction boundary, and that decision varies from program to program.

Note that the side effects must be observable from pure Haskell code. You cannot access the list of running processes that way.

Inside IO you can monitor everything that happens on the computer, so otherwise everything would be a side effect. Even 1 + 1 would have the side effect of taking time to compute and running instructions on your CPU, which you can observe inside IO.

Huh. So even the oft-quoted launching of the missiles, the most famous example of a side effect, isnā€™t a side effect in your view? This seems far too restrictive a notion of side effects. Now basically anything, no matter how unsafe, qualifies as technically ā€œsafeā€ because you cannot observe it from pure code.

Yes, this is basically what I said. If you fail to draw an abstraction boundary between internals and exposed behavior, then there is no such thing as safety. In practice, we donā€™t define that boundary formally. We just all sort of know that the time it takes to compute a pure expression isnā€™t thought of as a side effect, whereas launching missiles typically (but I guess not always!) is. If you want to prove anything, youā€™d need to be quite explicit about the boundary.

I guess so. I would consider anything that canā€™t come back to influence the actual evaluation is not really a side effect (that we should care about).

Consider (what feels to me like) the dual: reading something from the ā€œoutside worldā€ that is a static value, i.e. not influenced by anything else in the environment. I believe that is a common use case of unsafePerformIO in practice, e.g. Control.Concurrent.rtsSupportsBoundThreads (this actually uses the FFI and not unsafePerformIO directly).

I was just providing some examples of where ā€œregularā€ monadic I/O (which is strict) is too restrictive - then again, perhaps I didnā€™t quite get the intent of @tomjaguarpawā€™s remarkā€¦

so you agree that itā€™s a useful/legit use case in this context, donā€™t you?

In this context (and in the absence of a practical alternative): yes - as much as lazy I/O is despised, sometimes you need that laziness. Yes, such a mechanism has its problems, but then so does regular I/O e.g:

import System.Process(getCurrentPid, callCommand)

main = do pid <- getCurrentPid
          callCommand $ "kill " ++ show (toInteger pid)

Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots.

So far, the Universe is winning.

Rich Cook.

ā€¦unfortunately our reality isnā€™t quite so ā€œneat and tidyā€ - a Roman candle could go off-course and land where the computer running the Haskell program in control of the fireworks is located. Even reading in a static value can have an effect e.g. changing the access time of a file.

To me, this topic is just another aspect of I/O, and it continues to confound:

  • otherwise there wouldnā€™t have been calls for more "semantically simple" alternatives

  • ā€¦and this thread wouldnā€™t have attracted so much attention :-)

Getting back on-topic: a guide for the safe safer use of unsafe entities (even if it is a draft version) is probably the best that we can do for now, hence the inquiry. It would surely be a small improvement of the current situation where such advice, if it exists, is scattered throughout cyberspaceā€¦

1 Like

Nice, I completely agree.

tl;dr: Basically here they are making a cheap, thread-safe ā€œqueueā€ of IO Reply actions without the need for a ā€œcanonicalā€ queue-like data structures (Chan, TQueueā€¦) I have not benchmarked this but I can only assume it blows these alternatives out of the water.

Luckily, you canā€™t observe roman candles destroying your computer or file systems changing access times from within pure code.

Since pure code also cannot observe the side effects of their use, wouldnā€™t that imply unsafe I/O entities can be used anywhere?