Is the IO type built-in?

When I learned Haskell, I was taught that the type constructor IO is “built-in” to the language, i.e., it has to be implemented in the compiler and/or runtime themselves, rather than (like, say, Maybe (docs, source code)) in a library that ships with the compiler (and may be imported by default) but is otherwise treated just like my own Haskell code.

Though, is that actually true?

The documentation for Prelude.IO doesn’t mention its definition, but links to source code (the same that the documentation for GHC.Base.IO links to) with this definition:

newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))

So it is a thin wrapper around what seems to be a particular function type.

And sure enough, we can find the same code in

Is that GHC’s actual implementation of the IO type (thus making it a type defined in Haskell rather than built-in, I’d presume) or is that definition just there to make tooling (Haddock, language server, IDEs, etc.) happy?

(I’m aware that the logic to actually perform IO a actions has to be built-in, and that RealWorld and State# probably are too. And of course IO is treated specially in that an application’s entry point Main.main is required to be of type IO a. So even if IO were a non-built-in library type, it still couldn’t be replaced or replicated by a plain old userland-defined type.)

After all, the source code linked for them in the respective documentation says

{-
This is a generated file (generated by genprimopcode).
It is not code to actually be used. Its only purpose is to be
consumed by haddock.
-}

at the top of the file, and it is indeed rather different from the code actually in the project’s repo:

The Haskell 2010 report uses another term:

That’s how it’s implemented in MicroHs:

As for GHC, its use of an elaborated declaration allows more program transformations to optimise even I/O expressions: see State in Haskell (1995) for more details.

The Haskell language spec defines IO as an abstract type with a certain interface. Different Haskell implementations may implement this type differently, but there always is an implementation supported by the runtime system. GHC chose an implementation in terms of features that are not standard Haskell, but still GHC Haskell (unboxed tuples and types represented by nothing at all at runtime). In doing so, it does not need to maintain an implementation of IO in addition to those features.

It is the actual type (in GHC – as @sgraf points out, other implementations need not do it that way).
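One way to convince yourself of this is to use the constructor directly. The following is only a sketch, assuming a reasonably recent GHC where GHC.IO exports IO(..) and unIO; the names returnIO' and bindIO' are made up here for illustration and merely mirror what the Monad instance does.

```haskell
{-# LANGUAGE UnboxedTuples #-}
module Main where

-- Assumption: GHC.IO (from base) exports the IO data constructor and unIO.
import GHC.IO (IO (..), unIO)

-- Hypothetical name: a hand-written `pure` that passes the state token
-- through unchanged and pairs it with the result.
returnIO' :: a -> IO a
returnIO' x = IO (\s -> (# s, x #))

-- Hypothetical name: a hand-written `>>=` that runs the first action,
-- then feeds the new state token to the continuation.
bindIO' :: IO a -> (a -> IO b) -> IO b
bindIO' (IO m) k = IO (\s -> case m s of
  (# s', a #) -> unIO (k a) s')

main :: IO ()
main = returnIO' (20 + 22 :: Int) `bindIO'` print
```

If this compiles and runs like any other program, then the newtype in the linked source is what the compiler really uses, not just something for Haddock.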

It’s debatable. If you, like me, see Haskell as a language where in principle functions can be impure, but in practice any impure function has its return type wrapped in IO (thus preserving referential transparency) then it makes sense

Ah, right, should have phrased that part differently in my question. I’ve now edited my thread-starting post regarding that.

Is there anything particularly special about IO being “abstract” or is that just a normal use of Haskell module exports and the resulting visibilities, used here to ensure that

so that these [value-]constructors can’t be pattern-matched away and so that userland IO actions can only be defined by building upon other pre-existing IO actions?

Huh, I wasn’t aware that data declarations don’t need a right-hand side.
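For what it’s worth, here is a small sketch of such declarations in ordinary user code. Haskell 2010 allows a data declaration with no constructors, and a common use is as a phantom tag. The names Metres, Feet, Length and addLength are invented for this example. Unlike an implementation-provided abstract type, nothing here gets any help from the runtime.

```haskell
module Main where

-- Declarations with no right-hand side: types without data constructors.
data Metres
data Feet

-- The phantom parameter `unit` tags a Double with its unit of measure.
newtype Length unit = Length Double
  deriving Show

-- Only lengths carrying the same tag can be added.
addLength :: Length unit -> Length unit -> Length unit
addLength (Length a) (Length b) = Length (a + b)

main :: IO ()
main = do
  let runway  = Length 1200 :: Length Metres
      taxiway = Length 300  :: Length Metres
  print (addLength runway taxiway)
  -- addLength runway (Length 10 :: Length Feet) would be a type error.
```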

Is there anything particularly special about IO a being “abstract” […]

Yes: to help ensure I/O actions are used safely, by not discarding or duplicating the I/O state (that’s why certain “extra” imports are required to do the more “fascinating” activities). GHC’s declaration does then rely on ordinary Haskell module exports to prevent the IO data constructor:

IO :: (State# RealWorld -> (# State# RealWorld, a #)) -> IO a

from being errantly used.

An alternative approach (using GHC extensions) relies on annotations to restrict the use of the I/O state:

newtype IO a = IO (State# RealWorld %1-> (# State# RealWorld, a #))

(with the %1 preventing the State# RealWorld parameter from being shared or discarded). This allows the data constructor to be safely exported.

I always assumed the whole RealWorld thing was a GHC thing that made the compilation to Core better/more principled/something else nice somehow?

I find that (im)purity doesn’t really relate to IO at all. IO is pure (but those are theoretical technicalities).

My mental model is that there is:

  • evaluation of expressions (something you could do on a piece of paper if you wanted to)
  • execution of code (syscalls)

From a naive point of view, IO is the declarative boundary to execution. Although that is oversimplified, since we have RTS shenanigans and allocations happening regardless of whether something is IO or not.
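A tiny sketch of that distinction, using nothing beyond the Prelude (the name actions is just for illustration): IO actions are ordinary values that can be stored and rearranged by evaluation, and nothing happens until they are sequenced into main.

```haskell
module Main where

-- A plain list whose elements happen to be IO actions.
actions :: [IO ()]
actions = [putStrLn "first", putStrLn "second", putStrLn "third"]

main :: IO ()
main = do
  -- Evaluation: taking the length inspects the list, runs no action.
  print (length actions)
  -- Execution: only what is sequenced into main is actually performed.
  sequence_ (reverse actions)
```

This prints 3, then third, second, first: the list was built and reversed by evaluation alone, and output happened only for what main executed.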

Huh, I wasn’t aware that data declarations don’t need a right-hand side.

If they did…it would make defining another abstract Haskell type rather “interesting”:

data (->) a b = ... {- ??? -}

…and for the same reason: just like I/O actions, the way ordinary Haskell functions can be constructed is invariably specific to each Haskell implementation.

As I mentioned here:

no-one seems to be mystified by how a type like (->) a b - whose expressions have no (externally-visible) effects - can be implemented in thoroughly-imperative effect-centric assembly code.

IO a is another one of Haskell’s types which happens to be defined abstractly…since 1996 (when the Haskell 1.3 report was published).
