[ANN] data-elevator: Turn any lifted datatype into an unlifted dataype

I’ve just published the data-elevator package on hackage: data-elevator: Coerce between unlifted boxed and lifted types.

It’s the result of a joint brainstorming session at ZuriHac with Andreas Klebinger, Krzysztof Gogolewski and Matthew Craven (remotely). See this thread for details: Pre-proposal: Provide `unsafeForce :: forall (a :: Type) (b :: UnliftedType). a -> b` in Unsafe.Coerce (#21693) · Issues · Glasgow Haskell Compiler / GHC · GitLab

You can use its Strict :: LiftedType -> UnliftedType and Lazy :: LiftedType -> UnliftedType newtypes (!) to turn any lifted type into an unlifted (boxed) type like Array# and vice versa.

Example:

let f :: (a :: UnliftedType) -> Int
    f _ = 42
in f (Strict (error "boom" :: Int))

This will error out with "boom". While the following

let f :: a -> Int
    f _ = 42
in f (Lazy (error "boom" :: Array# Int))

will not error and return 42 instead. (Do note though that unlifted primitives like Array# currently don’t have enter code and any eval of Lazy will crash, Make everything BoxedRep have enter code (#21792) · Issues · Glasgow Haskell Compiler / GHC · GitLab).

I added a Coercible-like type, LevitySubsumption that allows to coerce some functions along the subkinding relationship Unlifted <: Lifted lifted over BoxedRep, TYPE and then (->). The only axioms are Strict a <: a and a <: Lazy a. With that in place, I can do call-by-value on any existing lazy function:

even :: Int -> Bool

strictEven :: Strict Int -> Bool
strictEven = levCoerce @(Int -> Bool) @(Strict Int -> Bool) even

even 42
strictEven (Strict 42)

See also Data.Elevator. It’s pretty interesting; constructing a coercion like that is very much what I would like GHC to do by itself one day, without any visible trace of levCoerce if possible (inferrable).

7 Likes

Looks fancy!

not clear what is going on

Is there an introduction for someone who was not looking into this «lifted» thing before? It is written well what this thing does, but it is not written why was it made. Overall, it looks like all the documentation was written for the «in» people. Surely you will have more engagement if you write for a wider audience.

  • How does it work — an overview?
  • What kind of goals does it help me achieve?
  • Are there any mathematical properties or theories to help me reason about it?

For example:

Turn a lifted data type into an unlifted one. Unlifted data types enjoy a call-by-value calling convention.

— This explanation is truly kmettian: it hints at something that an «in» person will immediately recognize, but it does nothing to either teach or direct to clarifying further reading. «call by value» and «calling convention» are both fairly complicated concepts that are best understood in a low level, procedural setting. It is not clear how type level programming in Haskell has anything to do with these concepts.

In the terms of the documentation system, you have some pieces of explanation sprinkled into your reference, but they would not work for anyone who might need an explanation!

not clear if there is any prior art

From the first sight, this looks to be working in the same direction as:

It would help me understand the value of this effort if the authors could give a short overview of how their solution improves upon the prior art.

3 Likes

I think the background information is in Levity polymorphism and more recently Kinds are calling conventions.

The problem with all existing ways to make values strict (deepseq, StrictData, strict-wrapper, (see below) bangpatterns etc.) is that they introduce calls to evaluate a value. This is not a big problem because a value only has to be evaluated once and so most of the time it is just a check that the value has already been evaluated. And these checks are cheap due to branch prediction, but they are not free, especially in tight loops where they take up valuable cache space.

Unlifted types are a solution to this problem where we encode the property that a value is always completely evaluated* in the types. Then the compiler doesn’t have to do all those checks. Until now one of the disadvantages has been that it is not easy to convert from a normal (lifted) type to an unlifted type.

* With the exception of any lifted data inside an unlifted data structure. Those parts do not have to be completely evaluated.

3 Likes

strict-wrapper doesn’t introduce calls to evaluate a value (except when you create it from a potentially-unevaluated value, which is presumably true of data-elevator too). On the other hand, the Strict (Maybe a) that it provides is not necessarily evaluated! The guarantee is that the Just payload (if any) is evaluated when the Strict (Maybe a) is.

Therefore I think that strict-wrapper and data-elevator do orthogonal things. strict-wrapper strictifies inside and data-elevator unlifts (kind of “strictifies outside”). (Perhaps data-elevator does the former too. I haven’t looked at more than the brief summary.)

1 Like

You’re right, I forgot about that. There are three possible issues with other techniques (in parentheses to which related work they apply):

  1. The extra evaluation calls I described above (bang patterns & seq).
  2. A deep traversal of the data structure (deepseq & strict-wrapper), this is obviously expensive for large data structures and GHC is bad at optimizing these deep traversals away. Be careful not to overuse these.
  3. Extra pointer indirection (StrictData & strict data type fields). This is not always a real problem, sometimes the indirection was necessary anyway. But you can’t just write data Strict a = MkStrict !a and get guaranteed performance benefits.

strict-wrapper doesn’t deep traverse, exactly to avoid the problem that deepseq has. (deepseq is rarely what you want in practice.) If you want nested strictness then you have to nest the Strict type family. For example

!a = Strict (Just (Just undefined))

does not error but

!a = Strict (Just (Strict (Just undefined)))

does.