it can impede sharing, for one!
I know when I want to use unsafePerformIO
at least - one might perform a pure computation (calling C functions) that allocates/frees memory.
I had a tidbit on The Interesting Part of Monadic Effects
Iâm not sure how it works under the hood though - so I canât justify why this whole affair works. But my intuition around unsafePerformIO
at least has always worked!
But this interface should be robust to multi-threading (at least).
How so? You could certainly solve something in one thread and then pass its result to another I imagine. You could probably even use unsafeDupablePerformIO
there if you didnât call the SMT solver twice on the same inputs (as I understand it).
âŚwhich is more-or-less the âsafe-use requirementsâ for unsafePerformIO
unsafeLocalState
- see chapter 30 of the Haskell 2010 Report.
Iâm just wondering: is there a library for this particular SMT framework? If so, perhaps primitive imports (one of GHCâs FFI extensions) could access the library directly, using a non-IO
type.
However in either case, foreign code is being used, and to quote one M. Chakravarty:
After all, you can import any C function with a pure type, which also allows you to wreak arbitrary havoc. We enable the user to disguise arbitrary machine code as a Haskell function of essentially arbitrary type. In comparison,
unsafePerformIO
[orunsafeLocalState
] seems angelic.
If anyone can make that arrangement robust to multi-threading, etc: well done!
Acquiring an MVar
is a global effect in the program. Doing this in the context of IO is much easier to understand than in the execution of pure code. What does it matter if the runtime system chooses to stop executing some pure code? If it makes such a choice after acquiring the MVar
it prevents any other thread from acquiring the MVar
.
Taking an MVar
is only impure if the MVar
is exposed to the rest of the program. That is not the case here. In this case the only thing that can happen is that a function takes longer to run (assuming that the MVar
does eventually get released), but that is not considered a side effect because pure programs cannot observe time.
Partial application exposes an invocation of withMVar
. This is the reason for the MVar
in the first place.
Regarding the specification of unsafePerformIO
:
I think the documentation of
unsafePerformIO
is clear enough to answer this, specifically:For this to be safe, the
IO
computation should be free of side effects and independent of its environment.
I, too, think this specification is exact. It also gives compiler optimisations the greatest leeway to compile pure code efficiently and is the reason why Haskell can execute efficiently in the first place.
The beauty is that the specification of unsafePerformIO
in f (unsafePerformIO e)
only relates to e
; it doesnât assume anything about the function/environment/context f
which is called with an argument that uses unsafePerformIO
. If the specification would allow some premise about all such f
, then certain compiler optimisations (like strictness analysis) would have to prove that f
is never called with an argument that uses unsafePerformIO
to unbox f
's argument, say. This would amount to saying that the optimiser must assume that Haskell is lazy, (non-total,) and impure, which is horrible to work with.
IMO, that is why the docs of unsafePerformIO
are written as they are. Many use cases, like those in bytestring
, fall into this fragment: After the âIO transactionâ e
finishes, the intermittent side-effects are no longer observable.
Regarding supposedly safe uses of unsafePerformIO
that fall outside the specification:
Given enough confidence of the programmer, deferring and interleaving side-effects with unsafeInterleaveIO
to achieve something like value-supply: A library for generating values without having to thread state. (which @atravers linked and which is also used in GHC) is possible if you make sure that all environments indeed canât observe the side-effects. That is often done by making very judicious use of supposedly strictly internal APIs like GHC.Types.Unique which break the abstraction. Properly locking away these APIs to âenvironmentsâ that we control and where we can make sure the optimiser doesnât interfere (by making use of NOINLINE etc.) means that the specification of unsafePerformIO
applies again. I would say this is quite similar to encapsulation in the OO sense, where an export like nonDetCmpUnique
would be considered bad practice to uphold the abstraction.
In terms of GHC, failure to do so means non-deterministic builds (in terms of binary hashes) and the resulting loss of incremental rebuilds. Despite years of ongoing efforts, I still feel like bugs still hide because of use of unsafe abstractions like that, because I sometimes observe more rebuilding than I feel is necessary.
Why is it so difficult to fix these bugs? Because we actually use the Int#
identity of GHCâs Unique
for implementing efficient maps in terms of IntMap
. And now suddenly the whole, pure API (well, at least half) of IntMap
has become non-deterministic with no IO
to warn us! See this Note in the compiler: compiler/GHC/Types/Unique.hs ¡ 706deee0524ca6af26c8b8d5cff17a6e401a2c18 ¡ Glasgow Haskell Compiler / GHC ¡ GitLab Our solution so far is to justify each use of a non-deterministic function (which all have a nonDet*
prefix) with a comment like It's OK to use nonDetUFMToList here because ...
which acts as our âproofâ in the sense of the OP.
Without Ord Unique
(which is non-deterministic), we could only use O(n^2) assocation lists for Unique/Name lookup, which is terrible. Non-determinism as a direct result of our unsafe use of unsafePerformIO
is just horrible, but we get to use O(n) IntMap
s. So Iâd call our solution horribly efficient
âŚfirst unsafe
, then nonDet
- appearing soon: unknown
.
-
Having praised monads to the hilt, let me level one criticism. Monads tend to be an all-or-nothing proposition. If you discover that you need interaction deep within your program, you must rewrite that segment to use a monad.
Philip Wadler (page 29 of 33).
-
Once youâre in the
IO
monad, youâre stuck there forever, and are reduced to Algol-style imperative programming. You cannot easily convert between functional and monadic style without a radical restructuring of code.
One alternative would be to replace the prefixes with types - then entities like these:
unsafeLocalState :: IO a -> a
nonDetUFMToList :: UniqFM key elt -> [(Unique, elt)]
would require an extra argument:
localState :: Unsafe# -> IO a -> a
ufmToList :: NonDet# -> UniqFM key elt -> [(Unique, elt)]
where Unsafe#
and NonDet#
are abstract, like GHCâs RealWorld
. Since itâs IO
model is already nondeterministic and can be unsafe e.g. using the FFI to call âpoorly-implementedâ foreign entities:
unsafely :: IO Unsafe#
nondetermined :: IO NonDet#
âŚso no extra harm done to (whatâs left of) the semantics attributed to Haskell.
Yes, the need for extra parameters would probably be annoying, but if it saves having to âmonadifyâ vast segments of existing code (or even a change of paradigm)âŚ
Uh, do we assume that pure style is strictly better than monadic style?
Here are my points against monadic style compared to pure style:
- Purely syntactically, I think it is clear that
f x
is objectively better thanf =<< x
. - Monadic style clutters code with irrelevant sequencing information.
- Monads add implicit information which must be considered when reasoning about functions. Just considering the arguments is no longer sufficient. In particular, almost anything can happen in the IO monad.
A couple of counterpoints:
- it is no easier (in principle) to reason about pure code than code in the
Identity
monad (because they are the same thing) - it is often easier (in practice) to reason about code written in the
State
monad than the equivalent pure code that hand-threads the state parameter (because the monad captures much of the book keeping you would otherwise have to do by hand)
So I donât think itâs clear that monadic code always hinders reasoning.
In fact I would say itâs generally easiest to reason about code that is written in the âappropriate monadâ, neither âtoo much monadicâ (IORef
s in IO
) nor âtoo little monadicâ (manually threading state parameters) but âjust right monadicâ (State
).
I donât think so. The only thing you can do with the result of that partial application is apply it again. You canât scrutinize the code.
That spec is not exact. At best itâs over, approximating. Arguably, the only environment-independent computations are those of the form f <$> pure x
; which are those that donât need to be in IO
in the first place.
Just to be clear, what Iâm trying to argue is that there are many legitimate uses of unsafePerformIO
for running computations that albeit not being env-independen or side-effect free (whatever that even means), are still perfectly safe to be used.
That starts to be something much more satisfactory that we might have a chance at formalizing and understanding. In fact, its not too far from the ideas proposed by Schmidt-SchauĂ on the papers that @atravers suggested.
I think what is central to your distinction is that Identity
and State s
are honest monads â i.e., can be interpreted in a suitable mathematical domain with a suitable equality and can be proven to satisfy the monad laws â IO
is not a monad in that sense. For one, what should the suitable equality be for IO
? This makes it effectively impossible to formally reason about it.
I side with @jaror and I prefer pure code. Actually, just last week we got bitten by monadic code when a ListT
was introduced into the stack, but we forgot that it comes with a MonadFail
instance that simply returns the emtpy list. We did not have an easy time finding this bug. We needed the transformer version because we wanted to avoid unsafePerformIO
and needed to thread a (MonadIO m)
around.
Exactly, while A
is evaluating it the first time (the MVar
is taken) B
applies it again, blocking on the MVar
. For B
to complete, A
must finish. What guarantees do you have that a particular execution of pure code will run to completion? Will it remain live? Will it be GCed? Will the MVar
be freed if it is GCed?
Oh, I see what you mean. In this case I think its indeed alright: the MVar
used with withMVar
, which is an exception-safe function, so even if A
fails, the MVar
would be released. I also think that the GC must not collect a computation whose result is still being computed. Are there counterexamples to this?
In this particular example, the risk is the SMT looping. But having IO
all over the code wouldnât save you either. So, I guess unsafePerformIO
is as safe as explicit IO
.
Regardless, Iâd like to reiterate that that function with the MVar
was just an example. Documenting all of these aspects and what the questions one must ask themselves to justify their use of unsafePerformIO
is exactlty what my original question was all about.
I donât have time to work up a test case right now, but, as far as I recall, the warnings listed in unsafeIOToSTM
apply to unsafePerformIO
in pure code. There may of course be other or future forms of speculative execution. It is not hard (modulo memory use) to reason about what happens in pure code execution. If it is becoming hard to reason about, perhaps âpurely functionalâ is the wrong abstraction.
unsafeIOToSTM
is much worse than unsafePerformIO
in that STM does things that donât happen in IO otherwise â in particular the retry mechanism can completely circumvent exception handling and safety. So if you have a bracket
call, the resource may be acquired, stm may interrupt the computation with a retry, and the resource will never be released.
On the other hand, using unsafePerformIO
in pure code, I think we can always assume that bracket
works as intended.
Uhm, isnât ListT considered bad and thus be avoided at all costs?
From page 17 of 51 in Launchbury and Peyton-Joneâs State in Haskell:
dfs g vs = runST (
newArr (bounds g) False `thenST` \ marks ->
search marks vs
)
where search :: MutArr s Vertex Bool -> [Vertex]
-> ST s [Tree Vertex]
search marks [] = returnST []
search marks (v:vs) = readArr marks v `thenST` \ visited ->
if visited then
search marks vs
else
writeArr marks v True `thenST_`
search marks (g!v) `thenST` \ ts ->
search marks vs `thenST` \ us ->
returnST ((Node v ts): us)
where:
type Graph = Array Vertex [Vertex]
data Tree a = Node a [Tree a]
Hereâs one way to rewrite that original code for dfs
:
dfs g vs = unsafePerformIO (
newIOArray (bounds g) False >>= \ marks ->
search marks vs
)
where search :: IOArray Vertex Bool -> [Vertex]
-> IO [Tree Vertex]
search marks [] = return []
search marks (v:vs) = readIOArray marks v >>= \ visited ->
if visited then
search marks vs
else
writeIOArray marks v True >>
search marks (g!v) >>= \ ts ->
search marks vs >>= \ us ->
return ((Node v ts): us)
Even though the use of side effects in both definitions is basically the same, the IO
version would be a violation of the given rule (it not being âfree of side effectsâ) - in this case the presence of side effects is of no consequence, as they cannot be observed outside the scope of dfs
.
I guess this again turns into a deeper discussion of what side effects really are. Letâs take for example the current Wikipedia definition of side effects which seems reasonable to me:
In computer science, an operation, function or expression is said to have a side effect if it modifies some state variable value(s) outside its local environment, that is to say has an observable effect besides returning a value (the primary effect) to the invoker of the operation. [emphasis added]
In your example the state is not modified outside its local environment and the effects are not observable by the caller, so I would say that there are no side effects in that code (at least when taken as a whole).
Even though the use of side effects in both definitions is basically the same, the
IO
version would be a violation of the given rule (it not being âfree of side effectsâ)
I rather have interpreted the âspecâ in the sense that âafter the IO action concludes (atomically), no side-effects must be visibleâ rather than âthe IO action must not perform any side-effects while executingâ. And the former clearly is the case for the IO version of dfs
, so to me the use of unsafePerformIO
is safe. The latter interpretation is much less useful.
I concur that there is too much room for interpretation in that comment.