Motivation
Since I have read the Linearly Quantified Types paper and Arnaud’s articles on Linear Constraints, I’ve been very excited about how we will be able to use linear constraints to formulate borrowing subslices of arrays within Linear Haskell.
Linear Constraint is not available in GHC today, as the Linear constraints proposal by Arnaud has just been discussed (I ed a lot ):
Although it is not available in the current GHC, we should still be able to simulate linear constraints by token data types, following the path in Arnaud’s article just the other way around.
To see this, I gave it a try to implement pure, parallel, and in-place FFT with today’s Linear Haskell (say GHC >= 9.0), inspired by the quicksort example in the paper (§4.2.3). I chose FFT because:
- FFT is another typical example of a divide-and-conquer algorithm, incorporating lending/borrowing subslice of an array.
- When I implemented FFT with ST-monad, I wanted to parallelise FFT in a type-safe manner and Linear Haskell seems to provide a means for it.
Result
So, here it is:
Here is the simplified version of the main loop:
loop :: RW n %1 -> SArray (Complex Double) n -> Int -> Double -> Double -> RW n
loop rw arr !n !c !s
if n <= 1 then rw else
halve rw arr & \(MkSlice sliced rwL rwR l r) ->
let !half = n `quot` 2
!dblCs = 2 * c * c - 1
!dblSn = 2 * s * c
!kW = c :+ s
divide = if n <= thresh then (,) else par
in divide
(loop rwL l half dblCs dblSn)
(loop rwR r half dblCs dblSn)
& \(rwL, rwR) ->
combine sliced rwL rwR l r & \(Ur arr, rw) ->
forN half
( \ !k (RW r w) ->
get r k arr & \(Ur ek, r) ->
get r (half + k) arr & \(Ur ok, r) ->
set (RW r w) k (ek + kW ^ k * ok) arr & \rw ->
set rw (half + k) (ek + kW ^ (half + k) * ok) arr
)
rw
The below is the example of the result of the above code:
And ThreadScope assures that the computation is indeed done parallelly, hurray!
Besides RW
s, it looks really simple and declarative implementation of FFT, parallelising works until the array length becomes below the prespecified threshold.
In the example above, RW
and SlicesTo
are the token datatype, mimicking the hypothetical corresponding linear constraints. So, once LinearConstraints gets implemented these arguments will mostly dissapear and handled implicitly by the compiler.
Anyway, here is the current token-based API (just the API described in the paper blindly translated to a token-based one):
data SArray a s -- Storable Array with elements of type a, s the unique Skolem variable
data R s -- ^ Read permission token for SArray a s
data W s -- ^ Write permission token for SArray a s
data RW s = RW (R s) (W s)
get ::
(SV.Storable a) =>
R s %1 -> Int -> SArray a s -> (Ur a, R s)
set ::
(SV.Storable a) =>
RW s %1 -> Int -> a -> SArray a s -> RW s
-- | The token witnessing @SArray a l@ (resp. @SArray a r@) is the first (resp. latter) half of the @SArray a s@.
data SlicesTo s l r
data Slice a s where
MkSlice ::
!(SlicesTo s l r) %1 ->
!(RW l) %1 ->
!(RW r) %1 ->
!(SArray a l) ->
!(SArray a r) ->
Slice a s
split ::
(SV.Storable a) =>
RW s %1 -> Int -> SArray a s -> Slice a s
combine ::
SlicesTo s l r %1 ->
RW l %1 ->
RW r %1 ->
SArray a l ->
SArray a r ->
(Ur (SArray a s), RW s)
halve ::
(SV.Storable a) =>
RW s %1 -> SArray a s -> Slice a s
Update: the original type of
set
was incorrect. Thank you @atravers for pointing this out!
Note that, concerning solely on a single array access, the token-based approach doesn’t reduce the burden of passing-around RW
, R
, and/or W
tokens (and Linear Constraints will help here). But slicing the array into two halves and lending to different function calls needs some bookkeeping with the Skolem variable trick, and the token-based approach can help even if LinearConstraints is unavailable. Without it, we should provide a specialised datatype for an array slice, requiring duplication of all the API for arrays for it.
Summary: even if the token-based approach is the tentative workaround until we get LinearConstraints, it still makes sense to adapt it when simulating uniform borrowing!
Notes on pure parallelism under destructive pure computations
The hardest part of the implementation was to avoid duplication of pure but destructive operations (such as writing to the array).
The subtlety is that now the seemingly pure term can contain destructive operations on arrays and if such ops are duplicated, the referential transparency collapses all at once!
To avoid this, we must use unsafePerformIO
instead of unsafeDupablePerformIO
in the write function for arrays (set
s).
But we also have to pay attention to the implementation of par
combinator:
par :: a %1 -> b %1 -> (a, b)
par a b = runEval C.do
a <- rpar a
b <- rpar b
a <- rseq a
b <- rseq b
C.pure (a, b)
Here, runEval
, rpar
and rseq
are custom linear-variant of those in parallel
package.
The key difference lies in the implementation of runEval
. In parallel
, it is implemented with unsafeDupablePerformIO
- this makes sense since, in the standard (non-linear) Haskell world, pure terms must not incorporate destructive operations.
But with our linear interface, OTOH, the set
function for arrays returns pure (token) value with the destructive operation under the hood. This means that if a call of runEval
(in par
) is evaluated more than once, then undesired writes to the same array could occur more than once, out of order!
So we had to implement runEval
in terms of unsafePerformIO
, not Dupable
one.
Conclusion
Although a little clumsy to pass around tokens, the token-based approach in Linear Haskell helps us to write pure array manipulation algorithms involving lending slices of arrays.
The result of FFT implementation looks concise and much alike traditional pure parallelism with par
s in traditional Haskell.
It is still clumsy to thread token values everywhere. Fortunately, if once LinearConstraints gets implemented, this burden goes away! So, let’s Linear constraints proposal:
Thank you Linear Haskell guys for pushing this forward!