[ANN] data-elevator: Turn any lifted datatype into an unlifted dataype

I’ve just published the data-elevator package on hackage: data-elevator: Coerce between unlifted boxed and lifted types.

It’s the result of a joint brainstorming session at ZuriHac with Andreas Klebinger, Krzysztof Gogolewski and Matthew Craven (remotely). See this thread for details: Pre-proposal: Provide `unsafeForce :: forall (a :: Type) (b :: UnliftedType). a -> b` in Unsafe.Coerce (#21693) · Issues · Glasgow Haskell Compiler / GHC · GitLab

You can use its Strict :: LiftedType -> UnliftedType and Lazy :: LiftedType -> UnliftedType newtypes (!) to turn any lifted type into an unlifted (boxed) type like Array# and vice versa.

Example:

let f :: (a :: UnliftedType) -> Int
    f _ = 42
in f (Strict (error "boom" :: Int))

This will error out with "boom". While the following

let f :: a -> Int
    f _ = 42
in f (Lazy (error "boom" :: Array# Int))

will not error and return 42 instead. (Do note though that unlifted primitives like Array# currently don’t have enter code and any eval of Lazy will crash, Make everything BoxedRep have enter code (#21792) · Issues · Glasgow Haskell Compiler / GHC · GitLab).

I added a Coercible-like type, LevitySubsumption that allows to coerce some functions along the subkinding relationship Unlifted <: Lifted lifted over BoxedRep, TYPE and then (->). The only axioms are Strict a <: a and a <: Lazy a. With that in place, I can do call-by-value on any existing lazy function:

even :: Int -> Bool

strictEven :: Strict Int -> Bool
strictEven = levCoerce @(Int -> Bool) @(Strict Int -> Bool) even

even 42
strictEven (Strict 42)

See also Data.Elevator. It’s pretty interesting; constructing a coercion like that is very much what I would like GHC to do by itself one day, without any visible trace of levCoerce if possible (inferrable).

9 Likes

Looks fancy!

not clear what is going on

Is there an introduction for someone who was not looking into this «lifted» thing before? It is written well what this thing does, but it is not written why was it made. Overall, it looks like all the documentation was written for the «in» people. Surely you will have more engagement if you write for a wider audience.

  • How does it work — an overview?
  • What kind of goals does it help me achieve?
  • Are there any mathematical properties or theories to help me reason about it?

For example:

Turn a lifted data type into an unlifted one. Unlifted data types enjoy a call-by-value calling convention.

— This explanation is truly kmettian: it hints at something that an «in» person will immediately recognize, but it does nothing to either teach or direct to clarifying further reading. «call by value» and «calling convention» are both fairly complicated concepts that are best understood in a low level, procedural setting. It is not clear how type level programming in Haskell has anything to do with these concepts.

In the terms of the documentation system, you have some pieces of explanation sprinkled into your reference, but they would not work for anyone who might need an explanation!

not clear if there is any prior art

From the first sight, this looks to be working in the same direction as:

It would help me understand the value of this effort if the authors could give a short overview of how their solution improves upon the prior art.

3 Likes

I think the background information is in Levity polymorphism and more recently Kinds are calling conventions.

The problem with all existing ways to make values strict (deepseq, StrictData, strict-wrapper, (see below) bangpatterns etc.) is that they introduce calls to evaluate a value. This is not a big problem because a value only has to be evaluated once and so most of the time it is just a check that the value has already been evaluated. And these checks are cheap due to branch prediction, but they are not free, especially in tight loops where they take up valuable cache space.

Unlifted types are a solution to this problem where we encode the property that a value is always completely evaluated* in the types. Then the compiler doesn’t have to do all those checks. Until now one of the disadvantages has been that it is not easy to convert from a normal (lifted) type to an unlifted type.

* With the exception of any lifted data inside an unlifted data structure. Those parts do not have to be completely evaluated.

3 Likes

strict-wrapper doesn’t introduce calls to evaluate a value (except when you create it from a potentially-unevaluated value, which is presumably true of data-elevator too). On the other hand, the Strict (Maybe a) that it provides is not necessarily evaluated! The guarantee is that the Just payload (if any) is evaluated when the Strict (Maybe a) is.

Therefore I think that strict-wrapper and data-elevator do orthogonal things. strict-wrapper strictifies inside and data-elevator unlifts (kind of “strictifies outside”). (Perhaps data-elevator does the former too. I haven’t looked at more than the brief summary.)

1 Like

You’re right, I forgot about that. There are four possible issues with other techniques (in parentheses to which related work they apply):

  1. The extra evaluation calls I described above (bang patterns & seq).
  2. A deep traversal of the data structure (deepseq), this is obviously expensive for large data structures and GHC is bad at optimizing these deep traversals away. Be careful not to overuse these.
  3. Extra pointer indirection (StrictData & strict data type fields). This is not always a real problem, sometimes the indirection was necessary anyway and sometimes GHC can optimize them. But you can’t just write data Strict a = MkStrict !a and get guaranteed performance benefits.
  4. Applies only to a manually specified set of container types (maybe it can be derived generically or via Template Haskell?). Only makes the contained elements strict, not the container itself. Has non-zero cost to convert between strict and lazy types. (strict-wrapper)

Edit: Thanks @tomjaguarpaw for the correction.

strict-wrapper doesn’t deep traverse, exactly to avoid the problem that deepseq has. (deepseq is rarely what you want in practice.) If you want nested strictness then you have to nest the Strict type family. For example

!a = Strict (Just (Just undefined))

does not error but

!a = Strict (Just (Strict (Just undefined)))

does.

1 Like

Now that I’m revisiting this, I think we could make a library similar to strict-wrapper but then also unlift (“strictify”) the container itself (the top level) and make sure that the elements have kind UnliftedType, so that the type system guides us to also unlift the element types (and their elements recursively). You would only be able to break that recursion if you mark something as Lazy explicitly.

I’ll try to write a prototype.

Edit: Here’s a proof of concept:

{-# LANGUAGE TypeFamilies, UnliftedDatatypes #-}

module T where

import GHC.List

import Data.Elevator
import Data.Kind
  
type family UnliftKind k where
  UnliftKind Type = UnliftedType
  UnliftKind (k1 -> k2) = UnliftKind k1 -> UnliftKind k2

class Unliftedly (t :: k) where
  type Unlifted t :: UnliftKind k

type UnliftedTuple :: UnliftedType -> UnliftedType -> UnliftedType
data UnliftedTuple a b = MkUnliftedTuple a b

instance Unliftedly (,) where
  type Unlifted (,) = UnliftedTuple

type UnliftedMaybe :: UnliftedType -> UnliftedType
data UnliftedMaybe a = UnliftedNothing | UnliftedJust a

instance Unliftedly Maybe where
  type Unlifted Maybe = UnliftedMaybe

million :: Integer
million = 1000000

foldlU :: forall a (b :: UnliftedType). (b -> a -> b) -> b -> [a] -> b
foldlU k z = go z where
  go s [] = s
  go s (x:xs) = k (go s xs) x

maybeFoldStrict :: Lazy (Unlifted (,) (Strict Integer) (Unlifted Maybe (Strict Integer)))
maybeFoldStrict = Lazy 
  (foldlU f (MkUnliftedTuple (Strict 0) UnliftedNothing) [1..million])
  where
    f :: Unlifted (,) (Strict Integer) (Unlifted Maybe (Strict Integer)) -> Integer -> Unlifted (,) (Strict Integer) (Unlifted Maybe (Strict Integer))
    f (MkUnliftedTuple (Strict i) UnliftedNothing) x = MkUnliftedTuple (Strict (i + 1)) (UnliftedJust (Strict x))
    f (MkUnliftedTuple (Strict i) (UnliftedJust (Strict j))) x = MkUnliftedTuple (Strict (i + 2)) (UnliftedJust (Strict (j + x)))
2 Likes

Cool idea! Though still a bit verbose for my tastes. Plus, although you have the Unlifted type class, you still have to write MkUnliftedTuple a b (which comes out of thin air) instead of … well I don’t know, I really just want (a, b), syntactically, and let the semantics be determined by the type’s kind.

I can see that it’s a clear upgrade path for strict-wrapper, though! You’d get strict-wrapper by wrapping your Unlifted ... things in a single top-level Lazy, I think (the sole purpose of which would be to be able to re-use foldl').

1 Like

I just realized I can make it much nicer:

{-# LANGUAGE TypeFamilies, UnliftedDatatypes, PatternSynonyms #-}
{-# LANGUAGE ViewPatterns #-}

module T where

import GHC.List

import Data.Elevator
  
class Unliftedly t where
  type Unlifted t :: UnliftedType
  matchUnlifted :: Unlifted t -> t
  constructUnlifted :: t -> Unlifted t

type UnliftedTuple :: UnliftedType -> UnliftedType -> UnliftedType
data UnliftedTuple a b = MkUnliftedTuple a b

instance (Unliftedly a, Unliftedly b) => Unliftedly (a, b) where
  type Unlifted (a, b) = UnliftedTuple (Unlifted a) (Unlifted b)
  matchUnlifted (MkUnliftedTuple x y) = (matchUnlifted x, matchUnlifted y)
  constructUnlifted (x, y) = MkUnliftedTuple (constructUnlifted x) (constructUnlifted y)

type UnliftedMaybe :: UnliftedType -> UnliftedType
data UnliftedMaybe a = UnliftedNothing | UnliftedJust a

instance Unliftedly a => Unliftedly (Maybe a) where
  type Unlifted (Maybe a) = UnliftedMaybe (Unlifted a)
  matchUnlifted UnliftedNothing = Nothing
  matchUnlifted (UnliftedJust x) = Just (matchUnlifted x)
  constructUnlifted Nothing = UnliftedNothing
  constructUnlifted (Just x) = UnliftedJust (constructUnlifted x)

instance Unliftedly Integer where
  type Unlifted Integer = Strict Integer
  matchUnlifted (Strict x) = x
  constructUnlifted x = Strict x

pattern Unlifted :: Unliftedly t => t -> Unlifted t
pattern Unlifted x <- (matchUnlifted -> x) where
  Unlifted = constructUnlifted

million :: Integer
million = 1000000

maybeFoldStrict :: Lazy (Unlifted (Integer, Maybe Integer))
maybeFoldStrict = foldl' f (Lazy (Unlifted @(Integer, Maybe Integer) (0, Nothing))) [1..million]
  where
    f :: Lazy (Unlifted (Integer, Maybe Integer)) -> Integer -> Lazy (Unlifted (Integer, Maybe Integer))
    f (Lazy (Unlifted @(Integer, Maybe Integer) (i, Nothing))) x = Lazy (Unlifted (i + 1, Just x))
    f (Lazy (Unlifted @(Integer, Maybe Integer) (i, Just j))) x = Lazy (Unlifted (i + 2, Just (j + x)))

Still a bit annoying that you need type applications.

1 Like